Abstract

In this paper, we take a control-theoretic approach to answering 
some standard questions in statistical mechanics, and use the results 
to derive limitations of classical measurements. A central problem is 
the relation between systems which appear macroscopically dissipa- 
tive but are microscopically lossless. We show that a linear system is 
dissipative if, and only if, it can be approximated by a linear lossless
system over arbitrarily long time intervals. Hence lossless systems are 
in this sense dense in dissipative systems. A linear active system can 
be approximated by a nonlinear lossless system that is charged with 
initial energy. As a by-product, we obtain mechanisms explaining the 
Onsager relations from time-reversible lossless approximations, and the 
fluctuation-dissipation theorem from uncertainty in the initial state of 
the lossless system. The results are applied to measurement devices 
and are used to quantify limits on the so-called observer effect, also 
called back action, which is the impact the measurement device has on 
the observed system. In particular, it is shown that deterministic back 
action can be compensated by using active elements, whereas stochastic
back action is unavoidable and depends on the temperature of the
measurement device. 

1 Introduction 

Analysis and derivation of limitations on what is achievable are at the core 
of many branches of engineering, and thus of tremendous importance. Ex- 
amples can be found in estimation, information, and control theories. In 
estimation theory, the Cramér-Rao inequality gives a lower bound on the
covariance of the estimation error, in information theory Shannon showed 
that the channel capacity gives an upper limit on the communication rate, 
and in control theory Bode's sensitivity integral bounds achievable control 
performance. For an overview of limitations in control and estimation, see 
the book [1]. Technology from all of these branches of engineering is used 
in parallel in modern networked control systems [2]. Much research effort 
is currently spent on understanding how the limitations from these fields 
interact. In particular, much effort has been spent on merging limitations 
from control and information theory, see for example [3-5]. This has yielded
insight about how future control systems should be designed to maximize 
their performance and robustness. 

Derivation of limitations is also at the core of physics. Well-known exam- 
ples are the laws of thermodynamics in classical physics and the uncertainty 
principle in quantum mechanics [6-8]. The exact implications of these physical
limitations on the performance of control systems have received little
attention, even though all components of a control system, such as actua- 
tors, sensors, and computers, are built from physical components which are 
constrained by physical laws. Control engineers discuss limitations in terms 
of location of unstable plant poles and zeros, saturation limits of actuators, 
and more recently channel capacity in feedback loops. But how does the 
amount of available energy limit the possible bandwidth of a control sys- 
tem? How does the ambient temperature affect the estimation error of an 
observer? How well can you implement a desired ideal behavior using phys- 
ical components? The main goal of this paper is to develop a theoretical 
framework where questions such as these can be answered, and initially to 
derive limitations on measurements using basic laws from classical physics. 
Quantum mechanics is not used in this paper. 

The derivation of physical limitations broadens our understanding of control
engineering, but these limitations are also potentially useful outside of
the traditional control-engineering community. In the physics community,
the rigorous error analysis we provide could help in the analysis of far-from- 
equilibrium systems when time, energy, and degrees of freedom are limited. 
For Micro-Electro-Mechanical Systems (MEMS), the limitation we derive 
on measurements can be of significant importance since the physical scale 
of micro machines is so small. In systems biology, limits on control perfor- 
mance due to molecular implementation have been studied [9]. It is hoped 
that this paper will be a first step in a unified theoretical foundation for 
such problems. 

1.1 Related work 

The derivation of thermodynamics as a theory of large systems which are 
microscopically governed by lossless and time-reversible fundamental laws of 
physics (classical or quantum mechanics) has a large literature, and tremendous
progress has been made for over a century within the field of statistical physics. See
for instance [10-13] for physicists' accounts of how dissipation can appear
from time-reversible dynamics, and the books [6-8] on traditional statistical
physics. In non-equilibrium statistical mechanics, the focus has tradition- 
ally been on dynamical systems close to equilibrium. A result of major 
importance is the fluctuation-dissipation theorem, which plays an important
role in this paper. The origin of this theorem goes back to Nyquist's and
Johnson's work [14, 15] on thermal noise in electrical circuits. In its full
generality, the theorem was first stated in [16]; see also [17]. The theorem
shows that thermal fluctuations of systems close to equilibrium determine
how the system dissipates energy when perturbed. The result can be used 
in two different ways: By observing the fluctuation of a system you can 
determine its dynamic response to perturbations; or by making small per- 
turbations to the system you can determine its noise properties. The result 
has found widespread use in many areas such as fluid mechanics, but also
in the circuit community, see for example [18, 19]. A recent survey article
about the fluctuation-dissipation theorem is [20]. Obtaining general results
for dynamical systems far away from equilibrium (far-from-equilibrium statistical
mechanics) has proved much more difficult. In recent years, the so-called
fluctuation theorem [21, 22] has received a great deal of interest. The
fluctuation theorem quantifies the probability that a system far away from 
equilibrium violates the second law of thermodynamics. Not surprisingly, 
for longer time intervals, this probability is exceedingly small. A surpris- 
ing fact is that the fluctuation theorem implies the fluctuation-dissipation 
theorem when applied to systems close to equilibrium [22]. The fluctuation 
theorem is not treated in this paper, but is an interesting topic for future
work. 

From a control theorist's perspective, it remains to understand what 
these results imply in a control-theoretical setting. One contribution of this 
paper is to highlight the importance of the fluctuation-dissipation theorem 
in control engineering. Furthermore, additional theory is needed that is both 
mathematically more rigorous and applies to systems not merely far-from- 
equilibrium, but maintained there using active control. More quantitative 
convergence and error analysis is also needed for systems not asymptoti- 
cally large, such as arise in biology, microelectronics, and micromechanical 
systems. 

Substantial work has already been done in the control community in for- 
mulating various results of classical thermodynamics in a more mathematical 
framework. In [23, 24], the second law of thermodynamics is derived and a
control-theoretic heat engine is obtained (in [25] these results are general- 
ized). In [26], a rigorous dynamical systems approach is taken to derive the 
laws of thermodynamics using the framework of dissipative systems [27, 28].
In [29], it is shown how the entropy flows in Kalman-Bucy filters, and in [30]
Linear-Quadratic-Gaussian control theory is used to construct heat engines.
In [31-33], the problem of how lossless systems can appear dissipative (compare
with [10-12] above) is discussed using various perspectives. In [34],
how the direction of time affects the difficulty of controlling a process is 
discussed. 

1.2 Contribution of the paper

The first contribution of the paper is that we characterize systems that can 
be approximated using linear or nonlinear lossless systems. We develop a 
simple, clear control-theoretic model framework in which the only assump- 
tions on the nature of the physical systems are conservation of energy and 
causality, and all systems are of finite dimension and act on finite time hori- 
zons. We construct high-order lossless systems that approximate dissipative 
systems in a systematic manner, and prove that a linear model is dissipative 
if, and only if, it is arbitrarily well approximated by lossless causal linear 
systems over an arbitrarily long time horizon. We show how the error between
the systems depends on the number of states in the approximation and
the length of the time horizon (Theorems 1 and 2). Since human experience
and technology are limited in time, space, and resolution, there are limits to
directly distinguishing between a low-order macroscopic dissipative system 
and a high-order lossless approximation. This result is important since it 
shows exactly what macroscopic behaviors we can implement with linear
lossless systems, and how many states are needed. In order to approximate 
an active system, even a linear one, with a lossless system, we show that the 
approximation must be nonlinear. Note that active components are at the 
heart of biology and all modern technology, in amplification, digital elec- 
tronics, signal transduction, etc. In the paper, we construct one class of 
low-order lossless nonlinear approximations and show how the approximation
error depends on the initial available energy (Theorems 4 and 5). Thus
in this control-theoretic context, nonlinearity is not a source of complexity, 
but rather an essential and valuable resource for engineering design. These 
results are all of theoretical interest, but should also be of practical interest.
In particular, the results give constructive methods for implementing
desired dynamical systems using a finite number of lossless components when
resources such as time and energy are limited. 

As a by-product of this contribution, the fluctuation-dissipation theorem
(Propositions 2 and 3) and the Onsager reciprocal relations (Theorem 3)
follow easily. The lossless systems studied here are consistent with
classical physics since they conserve energy. If time reversibility (see [28]
and also Definition 2) of the linear lossless approximation is assumed, the
Onsager relations follow. Uncertainty in the initial state of linear lossless
approximations gives a simple explanation for noise that can be observed at a
macroscopic level, as quantified by the fluctuation-dissipation theorem. The
fluctuation-dissipation theorem and the Onsager relations are well known and
have been shown in many different settings. Our contribution here is to give 
alternative explanations that use the language and tools familiar to control 
theorists. 

The second contribution of the paper is that we highlight the importance 
of the fluctuation-dissipation theorem for deriving limitations in control theory.
As an application of control-theoretic relevance, we apply it to models
of measurement devices. With idealized measurement devices that are not 
lossless, we show that measurements can be done without perturbing the 
measured system. We say these measurement devices have no back action, 
or alternatively, no observer effect. However, if these ideal measurement de- 
vices are implemented using lossless approximations, simple limitations on 
the back action that depend on the surrounding temperature and available
energy emerge. We argue that these lossless measurement devices and the 
resulting limitations are better models of what we can actually implement 
physically. 

We hope this paper is a step towards building a framework for under- 
standing fundamental limitations in control and estimation that arise due 
to the physical implementation of measurement devices and, eventually, actuation.
We defer many important and difficult issues here such as how to
actually model such devices realistically. It is also clear that this framework 
would benefit from a behavioral setting [35]. However, for the points we 
make with this paper, a conventional input-output setting with only regular 
interconnections is sufficient. Aficionados will easily see the generalizations, 
the details of which might be an obstacle to readability for others. Perhaps 
the most glaring unresolved issue is how to best motivate the introduction 
of stochastics. In conventional statistical mechanics, a stochastic framework 
is taken for granted, whereas we ultimately aim to explain if, where, and 
why stochastics arise naturally. We hope to address this in future papers. 
The paper [33] is an early version of this paper.

1.3 Organization 

The organization of the paper is as follows: In Section 2, we derive lossless
approximations of various classes of systems. First we look at memoryless
dissipative systems, then at dissipative systems with memory, and finally at
active systems. In Section 3, we look at the influence of the initial state of
the lossless approximations, and derive the fluctuation-dissipation theorem.
In Section 4, we apply the results to measurement devices, and obtain limits
on their performance. 

1.4 Notation 

Most notation used in the paper is standard. Let $f(t) \in \mathbb{R}^{n \times m}$ and $f_{ij}(t)$ be
the $(i, j)$-th element. Then $f(t)^T$ denotes the transpose of $f(t)$, and $f(t)^*$ the
complex conjugate transpose of $f(t)$. We define $\|f(t)\|_1 := \sum_{i,j} |f_{ij}(t)|$,
$\|f(t)\|_2 := \sqrt{\sum_{i,j} |f_{ij}(t)|^2}$, and $\bar{\sigma}(f(t))$ is the largest singular value of $f(t)$.
Furthermore, $\|f\|_{L_1[0,t]} := \int_0^t \|f(s)\|_1\, ds$, and $\|f\|_{L_2[0,t]} := \sqrt{\int_0^t \|f(s)\|_2^2\, ds}$. $I_n$
is the $n$-dimensional identity matrix.

2 Lossless Approximations 

2.1 Lossless systems 

In this paper, linear systems in the form 

$$\dot{x}(t) = J x(t) + B u(t), \quad x(t) \in \mathbb{R}^n,$$
$$y(t) = B^T x(t) + D u(t), \quad u(t), y(t) \in \mathbb{R}^p, \tag{1}$$

where $J$ and $D$ are antisymmetric ($J = -J^T$, $D = -D^T$) and $(J, B)$
is controllable, are of special interest. The system (1) is a linear lossless
system. We define the total energy $E(x)$ of (1) as

$$E(x) := \frac{1}{2} x^T x. \tag{2}$$

Since $J$ and $D$ are antisymmetric, the total energy satisfies

$$\frac{dE(x(t))}{dt} = x(t)^T \dot{x}(t) = y(t)^T u(t) =: w(t), \tag{3}$$

where $w(t)$ is the work rate on the system. If there is no work done on the
system, $w(t) = 0$, then the total energy $E(x(t))$ is constant. If there is work
done on the system, $w(t) > 0$, the total energy increases. The work, however,
can be extracted again, $w(t) < 0$, since the energy is conserved and the
system is controllable. In fact, all finite-dimensional linear minimal lossless
systems with supply rate $w(t) = y(t)^T u(t)$ can be written in the form (1),
see [28, Theorem 5]. Nonlinear lossless systems will also be of interest later in
the paper. They will also satisfy (2)-(3), but their dynamics are nonlinear.
Conservation of energy is a common assumption on microscopic models in
statistical mechanics and in physics in general [6]. The systems (1) are also
time reversible if, and only if, they are also reciprocal, see [28, Theorem 8]
and also Definitions 1-2 in Section 2.3. Hence, we argue the systems (1)
have desirable "physical" properties.

Remark 1. In this paper, we only consider systems that are lossless and
dissipative with respect to the supply rate $w(t) = y(t)^T u(t)$. This supply rate
is of special importance because of its relation to passivity theory. Indeed,
there is a theory for systems with more general supply rates, see for example
[27, 28], and it is an interesting problem to generalize the results here to
more general supply rates.

Remark 2. The system (1) is a linear port-Hamiltonian system, see for
example [36], with no dissipation. Note that the Hamiltonian of a linear
port-Hamiltonian system is identical to the total energy $E$.

There are well-known necessary and sufficient conditions for when a 
transfer function can be exactly realized using linear lossless systems: All 
the poles of the transfer function must be simple, located on the imaginary 
axis, and with positive semidefinite residues, see [28]. In this paper, we
show that linear dissipative systems can be arbitrarily well approximated
by linear lossless systems (1) over arbitrarily large time intervals. Indeed,
if we believe that energy is conserved, then all macroscopic models should
be realizable using lossless systems of possibly large dimension. The linear
lossless systems are rather abstract but have properties that we argue
are reasonable from a physical point of view, as illustrated by the following
example.

Figure 1: The inductor-capacitor circuit in Example 1.

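Example 1. It is a simple exercise to show that the circuit in Fig. 1 with
the current $i(t)$ through the current source as input $u(t)$, and the voltage
$v_1(t)$ across the current source as output $y(t)$ is a lossless linear system. We
have

$$\dot{x}(t) = \begin{pmatrix} 0 & -1/\sqrt{L_1 C_1} & 0 \\ 1/\sqrt{L_1 C_1} & 0 & -1/\sqrt{L_1 C_2} \\ 0 & 1/\sqrt{L_1 C_2} & 0 \end{pmatrix} x(t) + \begin{pmatrix} 1/\sqrt{C_1} \\ 0 \\ 0 \end{pmatrix} u(t),$$
$$y(t) = \begin{pmatrix} 1/\sqrt{C_1} & 0 & 0 \end{pmatrix} x(t),$$
$$x(t)^T = \begin{pmatrix} \sqrt{C_1}\, v_1(t) & \sqrt{L_1}\, i_1(t) & \sqrt{C_2}\, v_2(t) \end{pmatrix},$$
$$E(x(t)) = \frac{1}{2} x(t)^T x(t) = \frac{1}{2} \left( C_1 v_1(t)^2 + L_1 i_1(t)^2 + C_2 v_2(t)^2 \right),$$
$$w(t) = y(t) u(t) = v_1(t) i(t).$$

Note that $E(x(t))$ coincides with the energy stored in the circuit, and that
$w(t)$ is the power into the circuit. Electrical circuits with only lossless
components (capacitors and inductors) can be realized in the form (1), see [37].
Circuits with resistors can always be approximated by systems in the form
(1), as is shown in this paper.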
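The energy bookkeeping of Example 1 can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the ladder topology (capacitor C1 across the source, series inductor L1, terminating capacitor C2), the function names `lc_ladder` and `power_balance`, and all component values are hypothetical, since the figure itself is not reproduced here. It builds J and B in the scaled coordinates x = (sqrt(C1) v1, sqrt(L1) i1, sqrt(C2) v2) and compares the rate of change of stored energy with the supplied power.

```python
import math

def lc_ladder(C1, L1, C2):
    """Return (J, B) for the scaled state x = (sqrt(C1) v1, sqrt(L1) i1, sqrt(C2) v2).

    Assumed topology: C1 in parallel with the current source, L1 in series,
    C2 terminating the ladder.
    """
    a = 1.0 / math.sqrt(L1 * C1)
    b = 1.0 / math.sqrt(L1 * C2)
    J = [[0.0, -a, 0.0],
         [a, 0.0, -b],
         [0.0, b, 0.0]]
    B = [1.0 / math.sqrt(C1), 0.0, 0.0]
    return J, B

def power_balance(J, B, x, u):
    """Return (x^T xdot, y*u) for xdot = J x + B u and y = B^T x."""
    n = len(x)
    xdot = [sum(J[i][j] * x[j] for j in range(n)) + B[i] * u for i in range(n)]
    dE_dt = sum(x[i] * xdot[i] for i in range(n))
    y = sum(B[i] * x[i] for i in range(n))
    return dE_dt, y * u

# Hypothetical component values and an arbitrary state/input pair.
J, B = lc_ladder(C1=2.0, L1=0.5, C2=3.0)
antisymmetric = all(J[i][j] == -J[j][i] for i in range(3) for j in range(3))
dE_dt, w = power_balance(J, B, x=[0.3, -1.2, 0.7], u=1.5)
print(antisymmetric, abs(dE_dt - w) < 1e-12)
```

Because J is antisymmetric, the term x^T J x cancels and only the port power y u remains, which is the content of (3).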
2.2 Lossless approximation of dissipative memoryless systems

Many times macroscopic systems, such as resistors, are modeled by simple 
static (or memoryless) input-output relations 

$$y(t) = k u(t), \tag{4}$$

where $k \in \mathbb{R}^{p \times p}$. If $k$ is positive semidefinite, this system is dissipative
since work can never be extracted and the work rate is always nonnegative,
$w(t) = y(t)^T u(t) = u(t)^T k u(t) \geq 0$, for all $t$ and $u(t)$. Hence, (4) is not
lossless. Next, we show how we can approximate (4) arbitrarily well with
a lossless linear system (1) over finite, but arbitrarily long, time horizons
$[0, \tau]$. First of all, note that $k$ can be decomposed into $k = k_s + k_a$ where
$k_s$ is symmetric positive semidefinite, and $k_a$ is antisymmetric. We can
use $D = k_a$ in the lossless approximation (1) and need only to consider the
symmetric matrix $k_s$ next.
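The decomposition of the gain into a symmetric and an antisymmetric part is easy to illustrate. The sketch below is a minimal example with a hypothetical 2-by-2 gain; the helper names `split_gain` and `quad` are not from the paper. It shows that the antisymmetric part contributes nothing to the work rate u^T k u, which is why it can be absorbed into the direct term D.

```python
def split_gain(k):
    """Split a square matrix k into symmetric k_s and antisymmetric k_a parts."""
    n = len(k)
    k_s = [[0.5 * (k[i][j] + k[j][i]) for j in range(n)] for i in range(n)]
    k_a = [[0.5 * (k[i][j] - k[j][i]) for j in range(n)] for i in range(n)]
    return k_s, k_a

def quad(m, u):
    """Quadratic form u^T m u, i.e. the work rate for the static gain m."""
    n = len(u)
    return sum(u[i] * m[i][j] * u[j] for i in range(n) for j in range(n))

# Hypothetical gain whose symmetric part is positive definite.
k = [[2.0, 1.0], [-3.0, 4.0]]
k_s, k_a = split_gain(k)
u = [0.7, -1.1]
print(k_s, k_a)
print(quad(k, u), quad(k_s, u), quad(k_a, u))
```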

First, choose the time interval of interest, $[0, \tau]$, and rewrite $y(t) = k_s u(t)$
as the convolution

$$y(t) = \int_{-\infty}^{\infty} \kappa(t - s) u(s)\, ds, \quad \kappa(t) := k_s \delta(t), \tag{5}$$

where $u(t)$ is at least continuous and has support in the interval $[0, \tau]$,

$$u(t) = 0, \quad t \in (-\infty, 0] \cup [\tau, \infty),$$

and $\delta(t)$ is the Dirac distribution. The time interval $[0, \tau]$ should contain all
the time instants where we perform input-output experiments on the system.
The impulse response $\kappa(t)$ can be formally expanded in a Fourier
series over the interval $[-\tau, \tau]$,

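$$\kappa(t) \sim \frac{k_s}{2\tau} + \sum_{l=1}^{\infty} \frac{k_s}{\tau} \cos l\omega_0 t, \quad \omega_0 := \pi/\tau. \tag{6}$$

To be precise, the Fourier series (6) converges to $k_s \delta(t)$ in the sense of
distributions. Define the truncated Fourier series by $\kappa_N(t) := k_s/(2\tau) + \sum_{l=1}^{N-1} (k_s/\tau) \cos l\omega_0 t$
and split $\kappa_N(t)$ into a causal and an anti-causal part:

$$\kappa_N(t) =: \kappa_N^{c}(t) + \kappa_N^{ac}(t),$$
$$\kappa_N^{c}(t) = 0 \ (t < 0), \quad \kappa_N^{ac}(t) = 0 \ (t > 0).$$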
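The causal part $\kappa_N^{c}$ can be realized as the impulse response of a lossless
linear system (1) of order $(2N - 1)r$ using the matrices

$$J = J_N := \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & \Omega_N \\ 0 & -\Omega_N & 0 \end{pmatrix}, \quad \Omega_N := \mathrm{diag}\{\omega_0 I_r, 2\omega_0 I_r, \ldots, (N - 1)\omega_0 I_r\},$$
$$B = B_N := \frac{1}{\sqrt{\tau}} \begin{pmatrix} \tfrac{1}{\sqrt{2}} k_f^T & k_f^T & \cdots & k_f^T & 0 & \cdots & 0 \end{pmatrix}^T, \tag{7}$$

where $r = \operatorname{rank} k_s$ and $k_f \in \mathbb{R}^{r \times p}$ satisfies $k_s = k_f^T k_f$. That the series (6)
converges in the sense of distributions means that for all smooth $u(t)$ of
support in $[0, \tau]$ we have that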



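$$k_s u(t) = \lim_{N \to \infty} \int_{-\infty}^{\infty} \left( \kappa_N^{c}(t - s) + \kappa_N^{ac}(t - s) \right) u(s)\, ds.$$

A closer study of the two terms under the integral reveals that

$$\lim_{N \to \infty} \int_{-\infty}^{\infty} \kappa_N^{ac}(t - s) u(s)\, ds = \frac{1}{2} k_s u(t+),$$
$$\lim_{N \to \infty} \int_{-\infty}^{\infty} \kappa_N^{c}(t - s) u(s)\, ds = \frac{1}{2} k_s u(t-),$$

because of the anti-causal/causal decomposition and $\kappa_N^{c}(t) = \kappa_N^{ac}(-t)$, $t > 0$.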

Thus since $u(t)$ is smooth, we can also model $y(t) = k_s u(t)$ using only the
causal part $\kappa_N^{c}$ if it is scaled by a factor of two. This leads to a linear
lossless approximation of $y(t) = k_s u(t)$ that we denote by the linear operator
$K_N : C^2(0, \tau) \to C^2(0, \tau)$ defined by

$$y_N(t) = (K_N u)(t) = \int_{-\infty}^{\infty} 2\kappa_N^{c}(t - s) u(s)\, ds = \int_0^t 2\kappa_N^{c}(t - s) u(s)\, ds. \tag{8}$$

Here $C^2(0, \tau)$ denotes the space of twice continuously differentiable functions
on the interval $[0, \tau]$. The linear operator $K_N$ is realized by the triple
$(J_N, \sqrt{2} B_N, \sqrt{2} B_N^T)$. We can bound the approximation error as seen in the
following theorem.
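To make the lossless realization concrete, the following Python sketch simulates a scalar version of such a system: one integrator state plus N - 1 undamped oscillators, all driven by the same input. It is a sketch under assumptions rather than the paper's construction verbatim: the particular state ordering, the input gains `b0` and `beta` (chosen so that the impulse response equals twice the truncated causal kernel), the fixed-step RK4 integrator, and the test input u(t) = t^2 are all choices made for illustration. The extra state accumulates the work, so that stored energy and supplied work can be compared.

```python
import math

def simulate_bank(k_s, tau, N, u, t_end, dt=1e-3):
    """RK4 simulation of a scalar lossless oscillator bank driven by u(t).

    State layout: x[0] integrator, x[1..m] oscillator 'a' states,
    x[m+1..2m] oscillator 'b' states, x[-1] accumulated work (m = N - 1).
    Returns (output at t_end, stored energy, accumulated work).
    """
    w0 = math.pi / tau
    m = N - 1
    b0 = math.sqrt(k_s / tau)          # gain of the constant (l = 0) term
    beta = math.sqrt(2.0 * k_s / tau)  # gain of each cosine term

    def output(x):
        return b0 * x[0] + beta * sum(x[1:1 + m])

    def rhs(t, x):
        ut = u(t)
        dx = [0.0] * len(x)
        dx[0] = b0 * ut
        for ell in range(1, m + 1):
            dx[ell] = ell * w0 * x[m + ell] + beta * ut
            dx[m + ell] = -ell * w0 * x[ell]
        dx[-1] = output(x) * ut        # work rate w = y * u
        return dx

    x = [0.0] * (2 * m + 2)
    t = 0.0
    for _ in range(round(t_end / dt)):
        k1 = rhs(t, x)
        k2 = rhs(t + dt / 2, [xi + dt / 2 * ki for xi, ki in zip(x, k1)])
        k3 = rhs(t + dt / 2, [xi + dt / 2 * ki for xi, ki in zip(x, k2)])
        k4 = rhs(t + dt, [xi + dt * ki for xi, ki in zip(x, k3)])
        x = [xi + dt / 6 * (p + 2 * q + 2 * r + s)
             for xi, p, q, r, s in zip(x, k1, k2, k3, k4)]
        t += dt
    energy = 0.5 * sum(xi * xi for xi in x[:-1])
    return output(x), energy, x[-1]

# Approximate the unit resistor y = u on [0, 1] with N = 20 and u(t) = t^2.
y_end, energy, work = simulate_bank(1.0, 1.0, 20, lambda t: t * t, t_end=0.5)
print(y_end, energy, work)
```

In this sketch the output stays close to u(t), as a resistor would, while all supplied work remains stored in the internal states instead of being dissipated.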
Theorem 1. Assume that $u \in C^2(0, \tau)$ and $u(0) = 0$. Let $y(t) = k u(t) = k_s u(t) + k_a u(t)$
with $k_s$ symmetric positive semidefinite and $k_a$ antisymmetric.
Define a lossless approximation with realization $(J_N, \sqrt{2} B_N, \sqrt{2} B_N^T, k_a)$,
$y_N(t) = K_N u(t) + k_a u(t)$. Then the approximation error is bounded as

$$\|y(t) - y_N(t)\|_2 \leq \frac{2\bar{\sigma}(k_s)\tau}{\pi^2 (N - 1)} \left( \|\dot{u}(t)\|_2 + \|\dot{u}(0)\|_2 + \|\ddot{u}\|_{L_1[0,t]} \right),$$

for $t$ in $[0, \tau]$.

Proof. We have that $y(t) - y_N(t) = \sum_{l=N}^{\infty} (2k_s/\tau) \int_0^t \cos l\omega_0(t - s)\, u(s)\, ds$,
$t \in [0, \tau]$. The order of summation and integration has changed because
this is how the value of the series is defined in distribution sense. We
proceed by using repeated integration by parts on each term in the series.
It holds that $\int_0^t \cos l\omega_0(t - s)\, u(s)\, ds = [\int_0^t \sin l\omega_0(t - s)\, \dot{u}(s)\, ds]/(l\omega_0) = [\dot{u}(t) - \dot{u}(0) \cos l\omega_0 t - \int_0^t \cos l\omega_0(t - s)\, \ddot{u}(s)\, ds]/(l^2\omega_0^2)$.
Hence, we have the bound

$$\|y(t) - y_N(t)\|_2 \leq \frac{2\bar{\sigma}(k_s)}{\tau\omega_0^2} \sum_{l=N}^{\infty} \frac{1}{l^2} \left( \|\dot{u}(t)\|_2 + \|\dot{u}(0)\|_2 + \int_0^t \|\ddot{u}(s)\|_1\, ds \right).$$

Since $\sum_{l=N}^{\infty} 1/l^2 \leq 1/(N - 1)$, we establish the bound in the theorem. □
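The bound of Theorem 1 can be checked on a simple case. The sketch below assumes a scalar gain and the hypothetical test input u(t) = t^2, for which u(0) = 0 and the convolution with each cosine term has the closed form 2t/a^2 - 2 sin(at)/a^3 with a = l ω0 (obtained by the same integration by parts as in the proof). The function names are illustrative only.

```python
import math

def y_lossless(t, k_s, tau, N):
    """Closed-form y_N(t) for scalar gain k_s and the test input u(t) = t**2."""
    w0 = math.pi / tau
    total = k_s * t ** 3 / (3.0 * tau)  # constant term of 2*kappa_N^c times int_0^t s^2 ds
    for ell in range(1, N):
        a = ell * w0
        total += (2.0 * k_s / tau) * (2.0 * t / a ** 2 - 2.0 * math.sin(a * t) / a ** 3)
    return total

def theorem1_bound(t, k_s, tau, N):
    """Bound for u(t) = t**2: |du(t)| = 2t, |du(0)| = 0, ||ddu||_L1[0,t] = 2t."""
    return 2.0 * k_s * tau / (math.pi ** 2 * (N - 1)) * (2.0 * t + 0.0 + 2.0 * t)

k_s, tau = 1.0, 1.0
for N in (5, 50, 500):
    t = 0.8
    err = abs(k_s * t * t - y_lossless(t, k_s, tau, N))
    print(N, err, theorem1_bound(t, k_s, tau, N))
```

In this example the error shrinks roughly like 1/N, in line with the factor 1/(N - 1) in the bound.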

The theorem shows that by choosing the truncation order $N$ sufficiently
large, the memoryless model (4) can be approximated as well as we like
with a lossless linear system, if inputs are smooth. Hence we cannot then
distinguish between the systems $y = ku$ and $y_N = K_N u + k_a u$ using finite-time
input-output experiments. On physical grounds one may prefer the
model $K_N + k_a$ even though it is more complex, since it assumes the form (1)
of a lossless system (and is time reversible if $k$ is reciprocal, see Theorem 3).
Additional support for this idea is given in Section 3. Note that the lossless
approximation $K_N$ is far from unique: The time interval $[0, \tau]$ is arbitrary,
and other Fourier expansions than (6) are possible to consider. The point is,
however, that it is always possible to approximate the dissipative behavior
using a lossless model.

It is often a reasonable assumption that inputs $u(t)$, for example voltages,
are smooth if we look at a sufficiently fine time scale. This is because we
usually cannot change inputs arbitrarily fast due to physical limitations.
Physically, we can think of the approximation order $(2N - 1)r$ as the number
of degrees of freedom in a physical system, usually of the order of Avogadro's
number, $N \sim 10^{23}$. It is then clear that the interval length $\tau$ can be very
large without making the approximation error bound in Theorem 1 large.
This explains how the dissipative system (4) is consistent with a physics
based on energy conserving systems.

Remark 3. Note that it is well known that a dissipative memoryless system
can be modeled by an infinite-dimensional lossless system. We can model
an electrical resistor by a semi-infinite lossless transmission line using the
telegrapher's equation (the wave equation), see [38], for example. If the
inductance and capacitance per unit length of the line are $L$ and $C$, respectively,
then the characteristic impedance of the line, $\sqrt{L/C}$, is purely
resistive. One possible interpretation of $K_N$ is as a finite-length lossless
transmission line where only the $N$ lowest modes of the telegrapher's equation
are retained. Also in the physics literature lossless (or Hamiltonian)
approximations of dissipative memoryless systems can be found. In IKMl^f . 
a so-called Ohmic bath is used, for example. Note that it is not shown in 
these papers when, and how fast, the approximation converges to the dissipative
system. This is in contrast to the analysis presented herein, and the
error bound in Theorem 1.

2.3 Lossless approximation of dissipative systems with memory

In this section, we generalize the procedure from Section 2.2 to dissipative
systems that have memory. We consider asymptotically stable time-invariant
linear causal systems $G$ with impulse response $g(t) \in \mathbb{R}^{p \times p}$. Their
input-output relation is given by

$$y(t) = (Gu)(t) = \int_0^t g(t - s) u(s)\, ds. \tag{9}$$

Possible direct terms in $G$ can be approximated separately as shown in
Section 2.2. The system (9) is dissipative with respect to the work rate
$w(t) = y(t)^T u(t)$ if and only if $\int_0^{\tau} y(t)^T u(t)\, dt \geq 0$, for all $\tau \geq 0$ and admissible
$u(t)$. An equivalent condition, see [28], is that the transfer function
satisfies

$$\hat{g}(j\omega) + \hat{g}(-j\omega)^T \geq 0 \quad \text{for all } \omega. \tag{10}$$

Here $\hat{g}(j\omega)$ is the Fourier transform of $g(t)$.
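The frequency-domain condition (10) can be sampled numerically for scalar systems. The sketch below is illustrative only: the two impulse responses, the frequency grid, and the function names are assumptions, and checking a finite grid gives evidence for (10) rather than a proof. For a real scalar impulse response, the left-hand side of (10) is twice the real part of the transfer function.

```python
def is_positive_real_on_grid(g_hat, omegas, tol=1e-12):
    """Check g_hat(jw) + g_hat(-jw) >= 0 at the sampled frequencies (scalar case)."""
    return all((g_hat(1j * w) + g_hat(-1j * w)).real >= -tol for w in omegas)

def g_hat_dissipative(s):
    """Transfer function of g(t) = exp(-t), t >= 0."""
    return 1.0 / (s + 1.0)

def g_hat_active(s):
    """Transfer function of g(t) = exp(-t) - 2*exp(-2*t), t >= 0."""
    return 1.0 / (s + 1.0) - 2.0 / (s + 2.0)

omegas = [0.01 * i for i in range(10001)]  # 0 to 100 rad/s
print(is_positive_real_on_grid(g_hat_dissipative, omegas))
print(is_positive_real_on_grid(g_hat_active, omegas))
```

The second system is stable but violates (10) for every nonzero frequency, so it can deliver net work and is not dissipative.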

We will next consider the problem of how well, and when, a system $G$
can be approximated using a linear lossless system (1) (call it $G_N$) with
fixed initial state $x_0$,

$$y_N(t) = B^T e^{Jt} x_0 + \int_0^t B^T e^{J(t - s)} B u(s)\, ds, \tag{11}$$

for a set of input signals. Let us formalize the problem.

Problem 1. For any fixed time horizon $[0, \tau]$ and arbitrarily small $\epsilon > 0$,
when is it possible to find a lossless system with fixed initial state $x_0$ and
output $y_N$ such that

$$\|y(t) - y_N(t)\|_2 \leq \epsilon \|u\|_{L_2[0,t]}, \tag{12}$$

for all input signals $u \in L_2[0, \tau]$ and $0 \leq t \leq \tau$?

Note that we require $x_0$ to be fixed in Problem 1 so that it is independent
of the applied input $u(t)$. This means the approximation should work even
if the applied input is not known beforehand. Let us next state a necessary
condition for linear lossless approximations.

Proposition 1. Assume there is a linear lossless system $G_N$ that solves
Problem 1. Then it holds that

(i) If $x_0 \neq 0$, then $x_0$ is an unobservable state;

(ii) If $x_0 \neq 0$, then $x_0$ is an uncontrollable state; and

(iii) If the realization of $G_N$ is minimal, then $x_0 = 0$.

Proof. (i): The inequality (12) holds for $u = 0$ when $y = 0$. Then (12)
reduces to $\|y_N(t)\|_2 \leq 0$, for $t \in [0, \tau]$, which implies $y_N(t) = B^T e^{Jt} x_0 = 0$.
Thus a nonzero $x_0$ must be unobservable. (ii): For the lossless realizations it
holds that $\mathcal{N}(\mathcal{O}) = \mathcal{R}(\mathcal{O}^T)^{\perp} = \mathcal{R}(\mathcal{C})^{\perp}$, where $\mathcal{O}$ and $\mathcal{C}$ are the observability
and controllability matrices for the realization $(J, B, B^T)$. Thus if $x_0$ is
unobservable, it is also uncontrollable. (iii): Both (i) and (ii) imply (iii). □

Proposition 1 significantly restricts the classes of systems $G$ we can approximate
using linear lossless approximations. Intuitively, to approximate
active systems there must be energy stored in the initial state of $G_N$. But
Proposition 1 says that such initial energy is not available for the inputs and
outputs of $G_N$. The next theorem shows that we can approximate $G$ using
$G_N$ if, and only if, $G$ is dissipative.

Theorem 2. Suppose $G$ is a linear time-invariant causal system where
$\|g(t)\|_2$ is uniformly bounded, $g(t) \in L_1 \cap L_2(0, \infty)$, and $\dot{g}(t) \in L_1(0, \infty)$.
Then Problem 1 is solvable using a linear lossless $G_N$ if, and only if, $G$ is
dissipative.

Proof. See Appendix 6.1. □

The proof of Theorem 2 shows that the number of states needed in
$G_N$ is proportional to $\tau/\epsilon^2$, and again the required state space is large. The
result shows that for finite-time input-output experiments with finite-energy
inputs it is not possible to distinguish between the dissipative system and
its lossless approximations. Theorem 2 illustrates that a very large class
of dissipative systems (macroscopic systems) can be approximated by the
lossless linear systems we introduced in (1). The lossless systems are dense
in the dissipative systems, in the introduced topology. Again this shows how 
dissipative systems are consistent with a physics based on energy-conserving 
systems. 

In [28, Theorem 8], necessary and sufficient conditions for time reversible
systems are given. We can now use this result together with Theorem 2 to
prove a result reminiscent of the Onsager reciprocal relations which say
physical systems tend to be reciprocal, see for example [6]. Before stating
the result, we properly define what is meant by reciprocal and time reversible
systems. These definitions are slight reformulations of those found in [28].

A signature matrix $\Sigma_e$ is a diagonal matrix with entries either $+1$ or $-1$.

Definition 1. A linear time-invariant system $G$ with impulse response $g(t)$
is reciprocal with respect to the signature matrix $\Sigma_e$ if $\Sigma_e g(t) = g(t)^T \Sigma_e$.

Definition 2. Consider a finite-dimensional linear time-invariant system $G$
and assume that $x(0) = 0$. Let $u_1, u_2$ be admissible inputs to $G$, and $y_1, y_2$
be the corresponding outputs. Then $G$ is time reversible with respect to the
signature matrix $\Sigma_e$ if $y_2(t) = \Sigma_e y_1(-t)$ whenever $u_2(t) = -\Sigma_e u_1(-t)$.

Theorem 3. Suppose $G$ satisfies the assumptions in Theorem 2. Then $G$
is dissipative and reciprocal with respect to $\Sigma_e$ if, and only if, there exists a
time-reversible (with respect to $\Sigma_e$) arbitrarily good linear lossless approximation
$G_N$ of $G$.

Proof. See Appendix 6.2. □

Hence, one can understand that macroscopic physical systems close to 
equilibrium usually are reciprocal because their underlying dynamics are 
lossless and time reversible. 
Remark 4. There is a long-standing debate in physics about how macroscopic
time-irreversible dynamics can result from microscopic time-reversible
dynamics. The debate goes back to Loschmidt's paradox and the Poincaré
recurrence theorem. The Poincaré recurrence theorem says that bounded
trajectories of volume-preserving systems (such as lossless systems) will return
arbitrarily close to their initial conditions if we wait long enough (the
Poincaré recurrence time). This seems counter-intuitive for real physical
systems. One common argument is that the Poincaré recurrence time for
macroscopic physical systems is so long that we will never experience a recurrence.
But this argument is not universally accepted and other explanations
exist. The debate still goes on, see for example [13]. In this paper we construct
lossless and time-reversible systems with arbitrarily large Poincaré
recurrence times, that are consistent with observations of all linear dissipative
(time-irreversible) systems, as long as those observations take place
before the recurrence time. For a control-oriented related discussion about
the arrow of time, see 

2.4 Nonlinear lossless approximations 

In Section 2.2, it was shown that a dissipative memoryless system can be
approximated using a lossless linear system. Later, in Section 2.3, it was also
shown that the approximation procedure can be applied to any dissipative
(linear) system. Because of Proposition 1 and Theorem 2, it is clear that it
is not possible to approximate a linear active system using a linear lossless
system with fixed initial state. Next we will show that it is possible to solve
Problem 1 for active systems if we use nonlinear lossless approximations.
Consider the simplest possible active system,

$$y(t) = k u(t), \tag{13}$$

where $k \in \mathbb{R}^{p \times p}$ is negative definite. This can be a model of a negative
resistor, for example. More general active systems are considered below.
The reason a linear lossless approximation of (13) cannot exist is that the
active device has an internal infinite energy supply, but we cannot store
any energy in the initial state of a linear lossless system and simultaneously
track a set of outputs, see Proposition 1. However, if we allow for lossless
nonlinear approximations, (13) can be arbitrarily well approximated. This
is shown next by means of an example.
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Consider the nonlinear system 



±E{t) = -j==u{tfku{t), xe{0) = Eo > 0, 

^ ° (14) 
VEit) = ^^kn{t), 

with a scalar energy-supply state XE{t), and total energy E{xe) = ^^'e- 
The system (|14p has initial total energy ^X£;(0)^ =: Eq, and is a lossless 
system with respect to the work rate w{t) = yE{t)u{t), since 

j^E{xE{t)) = XE{t)xE{t) = yE{tfu{t). 



The input-output relation of (jl4p is given by 

I ft 

XE{t) = ^2Eq + ^= / u{sfku{s)ds, 



1 /■* 
IJEit) = ku{t) + —-^ku{t) I u{s)'^ku{s)ds. 



(15) 



We have the following approximation result. 

Theorem 4. For uniformly bounded inputs, $\|u(t)\|_2 \le \bar{u}$, $t \in [0,\tau]$, the error between the active system (13) and the nonlinear lossless approximation (14) can be bounded as

$$\|y_E(t) - y(t)\|_2 \le \epsilon \|u\|_{L_2[0,t]},$$

for $t \in [0,\tau]$, where $\epsilon = \bar{\sigma}(k)^2 \bar{u}^2 \sqrt{\tau}/(2E_0)$.

Proof. A simple bound on $y_E(t) - ku(t)$ from (15) gives $\|y_E(t) - y(t)\|_2 \le \frac{\bar{\sigma}(k)^2 \|u(t)\|_2}{2E_0} \int_0^t \|u(s)\|_2^2\,ds$. Then using $\|u(t)\|_2 \le \bar{u}$, $t \in [0,\tau]$, so that $\int_0^t \|u(s)\|_2^2\,ds = \|u\|_{L_2[0,t]}^2 \le \bar{u}\sqrt{\tau}\,\|u\|_{L_2[0,t]}$, gives the result. □

The error bound in Theorem 4 can be made arbitrarily small for finite time intervals if the initial total energy $E_0$ is large enough. This example shows that active systems can also be approximated by lossless systems, if the lossless systems are allowed to be nonlinear and are charged with initial energy.
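
The bound in Theorem 4 can be checked numerically. The sketch below is only an illustration under stated assumptions: a scalar negative resistor $k = -1$, the test input $u(t) = \bar{u}\sin(2t)$, and trapezoidal quadrature for the integral in (15). The helper name `approximation_error` is hypothetical.

```python
import numpy as np

def approximation_error(E0, k=-1.0, u_bar=1.0, tau=5.0, n=5001):
    """Return (max |y_E - k u|, min slack of the Theorem 4 bound) on [0, tau].

    Scalar instance of (14)-(15); k, u_bar, tau and the sine input are
    illustrative assumptions, not values from the paper.
    """
    t = np.linspace(0.0, tau, n)
    u = u_bar * np.sin(2.0 * t)
    # Cumulative trapezoidal integral of u(s)^2 from 0 to t.
    u2 = u**2
    int_u2 = np.concatenate(([0.0], np.cumsum(0.5 * (u2[1:] + u2[:-1]) * np.diff(t))))
    # Energy-supply state and output of the lossless system, see (15).
    x_E = np.sqrt(2.0 * E0) + k * int_u2 / np.sqrt(2.0 * E0)
    y_E = x_E / np.sqrt(2.0 * E0) * k * u
    error = np.abs(y_E - k * u)
    # Theorem 4 bound; sigma_bar(k) = |k| in the scalar case.
    eps = k**2 * u_bar**2 * np.sqrt(tau) / (2.0 * E0)
    bound = eps * np.sqrt(int_u2)
    return error.max(), (bound - error).min()

for E0 in (10.0, 1000.0):
    max_err, slack = approximation_error(E0)
    print(f"E0 = {E0:7.1f}: max error = {max_err:.2e}, min slack = {slack:.2e}")
```

In this scalar example the error is inversely proportional to $E_0$, so charging the energy-supply state with a hundred times more energy reduces the worst-case error by the same factor.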

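The above approximation method can in fact be applied to much more general systems. Consider the ordinary differential equation

$$\begin{aligned}
\dot{x}(t) &= f(x(t),u(t)), \qquad x(0) = x_0, \\
y(t) &= g(x(t),u(t)),
\end{aligned} \tag{16}$$

where $x(t) \in \mathbb{R}^n$, and $u(t), y(t) \in \mathbb{R}^p$. In general, this is not a lossless system with respect to the supply rate $w(t) = y(t)^T u(t)$. A nonlinear lossless approximation of (16) is given by

$$\begin{aligned}
\dot{\hat{x}}(t) &= \frac{x_E(t)}{\sqrt{2E_0}} f(\hat{x}(t),u(t)), \qquad \hat{x}(0) = x_0, \\
\dot{x}_E(t) &= \frac{1}{\sqrt{2E_0}}\, g(\hat{x}(t),u(t))^T u(t) - \frac{1}{\sqrt{2E_0}}\, \hat{x}(t)^T f(\hat{x}(t),u(t)), \\
y_E(t) &= \frac{x_E(t)}{\sqrt{2E_0}}\, g(\hat{x}(t),u(t)), \qquad x_E(0) = \sqrt{2E_0},
\end{aligned} \tag{17}$$

where again $x_E(t)$ is a scalar energy-supply state, and $\hat{x}(t) \in \mathbb{R}^n$ can be interpreted as an approximation of $x(t)$ in (16). That (17) is lossless can be verified using the storage function

$$E = \frac{1}{2}\hat{x}(t)^T \hat{x}(t) + \frac{1}{2}x_E(t)^2,$$

since

$$\begin{aligned}
\dot{E} &= (x_E/\sqrt{2E_0})\left(\hat{x}^T f(\hat{x},u) + g(\hat{x},u)^T u - \hat{x}^T f(\hat{x},u)\right) \\
&= (x_E/\sqrt{2E_0})\, g(\hat{x},u)^T u = y_E^T u = w.
\end{aligned}$$

Since $x_E(t)/\sqrt{2E_0} \approx 1$ for small $t$, it is intuitively clear that $\hat{x}(t)$ in (17) will be close to $x(t)$ in (16), at least for small $t$ and large initial energy $E_0$. We have the following theorem.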

Theorem 5. Assume that $\partial f/\partial x$ is continuous with respect to $x$ and $t$, and that (16) has a unique solution $x(t)$ for $0 \le t \le \tau$. Then there exist positive constants $C_1$ and $E_1$ such that for all $E_0 \ge E_1$ (17) has a unique solution $\hat{x}(t)$ which satisfies $\|x(t) - \hat{x}(t)\|_2 \le C_1/\sqrt{2E_0}$ for all $0 \le t \le \tau$.

Proof. Introduce the new coordinate $\Delta x_E = x_E - \sqrt{2E_0}$ and define $\epsilon_0 := 1/\sqrt{2E_0}$. The system (17) then takes the form

$$\begin{aligned}
\dot{\hat{x}} &= (1 + \epsilon_0 \Delta x_E) f(\hat{x},u), \qquad \hat{x}(0) = x_0, \\
\Delta\dot{x}_E &= \epsilon_0\, g(\hat{x},u)^T u - \epsilon_0\, \hat{x}^T f(\hat{x},u), \qquad \Delta x_E(0) = 0.
\end{aligned}$$

Perturbation analysis [39, Section 10.1] in the parameter $\epsilon_0$ as $\epsilon_0 \to 0$ yields that there are positive constants $\epsilon_1$ and $C_1$ such that $\|x - \hat{x}\|_2 \le C_1|\epsilon_0|$ for all $|\epsilon_0| \le \epsilon_1$. The result then follows with $E_1 = 1/(2\epsilon_1^2)$. □

Just as in Section 2.3, the introduced lossless approximations are not unique. The one introduced here, (17), is very simple since only one extra state $x_E$ is added. Its accuracy ($C_1$, $E_1$) of course depends on the particular system $(f, g)$ and the time horizon $\tau$. An interesting topic for future work is to develop a theory for "optimal" lossless approximations using a fixed amount of energy and a fixed number of states.
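
The construction (17) can be tried out on a small example. The following sketch is an illustration under assumptions that are not taken from the paper: the scalar system $f(x,u) = -x + u$, $g(x,u) = x$, the input $u(t) = \sin t$, and a hand-written Runge-Kutta integrator. The names `rk4` and `max_state_error` are hypothetical.

```python
import numpy as np

def f(x, u):
    return -x + u          # illustrative dynamics (assumption)

def g(x, u):
    return x               # illustrative output map (assumption)

def u_of(t):
    return np.sin(t)       # illustrative bounded input (assumption)

def rk4(rhs, z0, t):
    """Classical fourth-order Runge-Kutta integration of z' = rhs(t, z)."""
    z = np.empty((len(t), len(z0)))
    z[0] = z0
    for i in range(len(t) - 1):
        h = t[i + 1] - t[i]
        k1 = rhs(t[i], z[i])
        k2 = rhs(t[i] + h / 2, z[i] + h / 2 * k1)
        k3 = rhs(t[i] + h / 2, z[i] + h / 2 * k2)
        k4 = rhs(t[i] + h, z[i] + h * k3)
        z[i + 1] = z[i] + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return z

def max_state_error(E0, x0=1.0, tau=2.0, n=2001):
    """Largest |x(t) - x_hat(t)| on [0, tau] between (16) and (17)."""
    t = np.linspace(0.0, tau, n)
    r = np.sqrt(2.0 * E0)

    def original(ti, z):                       # system (16)
        return np.array([f(z[0], u_of(ti))])

    def lossless(ti, z):                       # system (17), z = [x_hat, x_E]
        xh, xE = z
        u = u_of(ti)
        return np.array([xE / r * f(xh, u),
                         (g(xh, u) * u - xh * f(xh, u)) / r])

    x = rk4(original, np.array([x0]), t)[:, 0]
    xh = rk4(lossless, np.array([x0, r]), t)[:, 0]
    return np.max(np.abs(x - xh))

errors = {E0: max_state_error(E0) for E0 in (1e2, 1e4)}
for E0, err in errors.items():
    print(f"E0 = {E0:8.0f}: max |x - x_hat| = {err:.2e}")
```

Theorem 5 only guarantees an error proportional to $1/\sqrt{2E_0}$ for sufficiently large $E_0$; the sketch simply illustrates that the error shrinks as the initial energy grows for this particular choice of $(f, g)$.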

2.5 Summary 

In Section 2 we have seen that a large range of systems, both dissipative and active, can be approximated by lossless systems. Lossless systems account for the total energy, and we claim these models are more physical. It was shown that linear lossless systems are dense in the set of linear dissipative systems. It was also shown that time reversibility of the lossless approximation is equivalent to a reciprocal dissipative system. To approximate active systems, nonlinearity is needed. The introduced nonlinear lossless approximation has to be initialized at a precise state with a large total energy ($E_0$). The nonlinear approximation achieves better accuracy (smaller $\epsilon$) by increasing the initial energy (increasing $E_0$). This is in sharp contrast to the linear lossless approximations of dissipative systems, which are initialized with zero energy ($E_0 = 0$). These achieve better accuracy (smaller $\epsilon$) by increasing the number of states (increasing $N$). The next section deals with uncertainties in the initial state of the lossless approximations.

3 The Fluctuation-Dissipation Theorem 

As discussed in the introduction, the fluctuation-dissipation theorem plays a major role in close-to-equilibrium statistical mechanics. The theorem has been stated in many different settings and for different models. See for example [17, 20], where it is stated for Hamiltonian systems and Langevin equations. In [18, 19], it is stated for electrical circuits. A fairly general form of the fluctuation-dissipation theorem is given in [6, p. 500]. We re-state this version of the theorem here.

Suppose that $y_i$ and $u_i$, $i = 1, \ldots, p$, are conjugate external variables (inputs and outputs) for a dissipative system in thermal equilibrium of temperature $T$ [Kelvin] (as defined in Section 3.1). We can interpret $y_i$ as a generalized velocity and $u_i$ as the corresponding generalized force, such that $y_i u_i$ is a work rate [Watt]. Although the system is generally nonlinear, we only consider small variations of the state around a fixed point of the dynamics, which allows us to assume the system to be linear. Assume first that the system has no direct term (no memoryless element). If we make a perturbation in the forces $u$, the velocities $y$ respond according to

$$y(t) = \int_0^t g(t-s)u(s)\,ds,$$

where $g(t) \in \mathbb{R}^{p \times p}$ is the impulse response matrix by definition. The following fluctuation-dissipation theorem now says that the velocities $y$ actually also fluctuate around the equilibrium.

Proposition 2. The total response of a linear dissipative system $G$ with no memoryless element and in thermal equilibrium of temperature $T$ is given by

$$y(t) = n(t) + \int_0^t g(t-s)u(s)\,ds, \tag{18}$$

for perturbations $u$. The fluctuation $n(t) \in \mathbb{R}^p$ is a stationary Gaussian stochastic process, where

$$\begin{aligned}
\mathbf{E}\,n(t) &= 0, \\
R_n(t,s) := \mathbf{E}\,n(t)n(s)^T &= \begin{cases} k_B T\, g(t-s), & t - s \ge 0, \\ k_B T\, g(s-t)^T, & t - s < 0, \end{cases}
\end{aligned} \tag{19}$$

where $k_B$ is Boltzmann's constant.

Proof. See Section 3.1. □

The covariance function of the noise $n$ is determined by the impulse response $g$, and vice versa. The result has found widespread use in, for example, fluid mechanics: by empirical estimation of the covariance function we can estimate how the system responds to external forces. In circuit theory, the result is often used in the other direction: the forced response determines the color of the inherent thermal noise. One way of understanding the fluctuation-dissipation theorem is by using linear lossless approximations of dissipative models, as seen in the next subsection.

We may also express (18) in state space form in the following way. A dissipative system with no direct term can always be written as [28, Theorem 3]:

$$\begin{aligned}
\dot{x}(t) &= (J-K)x(t) + Bu(t), \\
y(t) &= B^T x(t),
\end{aligned} \tag{20}$$

where $K = K^T$ is positive semidefinite and $J$ antisymmetric. To account for (18)-(19), it suffices to introduce a white noise term $v(t)$ in (20) in the following way,

$$\begin{aligned}
\dot{x}(t) &= (J-K)x(t) + Bu(t) + \sqrt{2k_BT}\,Lv(t), \\
y(t) &= B^T x(t),
\end{aligned} \tag{21}$$

where the matrix $L$ is chosen such that $LL^T = K$. Equation (21) is called the Langevin equation of the dissipative system.
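
One property of the Langevin equation (21) is easy to check numerically: with $u = 0$, the state covariance $k_BT I_n$ satisfies the stationary Lyapunov equation of (21), which matches the equipartitioned covariance that appears later in Proposition 4. The sketch below is a minimal check under assumptions: randomly generated $J$ and $L$, and normalized units with $k_BT = 1$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 2                          # state and noise dimensions (arbitrary)
A = rng.standard_normal((n, n))
J = A - A.T                          # antisymmetric matrix
L = rng.standard_normal((n, m))
K = L @ L.T                          # symmetric positive semidefinite, K = L L^T
kB_T = 1.0                           # k_B * T in normalized units (assumption)

F = J - K
X = kB_T * np.eye(n)                 # candidate stationary covariance
# Stationary Lyapunov equation of (21) with u = 0:
#   F X + X F^T + 2 k_B T L L^T = 0
residual = F @ X + X @ F.T + 2.0 * kB_T * K
print(np.max(np.abs(residual)))
```

The residual vanishes up to rounding because $J + J^T = 0$ and the dissipation $-2K$ is exactly balanced by the noise intensity $2k_BT\,LL^T$.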

Dissipative systems with memoryless elements are of great practical significance. Proposition 2 needs to be slightly modified for such systems.

Proposition 3. The total response of a linear dissipative memoryless system in thermal equilibrium of temperature $T$ and for perturbations $u$ is given by

$$y(t) = n(t) + ku(t) = n(t) + k_su(t) + k_au(t), \tag{22}$$

where $k_s \ge 0$ is symmetric positive semidefinite, and $k_a$ antisymmetric. The fluctuation $n(t)$ is a white Gaussian stochastic process, where

$$\mathbf{E}\,n(t) = 0, \qquad R_n(t,s) := \mathbf{E}\,n(t)n(s)^T = 2k_BTk_s\,\delta(t-s).$$

Proposition 3 follows from Proposition 2 if one extracts the dissipative term $k_su(t)$ from the memoryless model $ku(t)$ and puts $g(t) = k_s\delta(t)$. However, the integral in (18) runs up to $s = t$ and cuts the impulse $\delta(t)$ in half. The re-normalized impulse response of the dissipative term is therefore given by $g(t) = 2k_s\delta(t)$ (see also Section 2.2). The result then follows using this $g(t)$ by application of Proposition 2. One explanation for why the antisymmetric term $k_a$ can be removed from $g(t)$ is that it can be realized exactly using the direct term $D$ in the linear lossless approximation (1). An application of Proposition 3 gives the Johnson-Nyquist noise of a resistor.

Example 2. As first shown theoretically in [15] and experimentally in [14], a resistor $R$ of temperature $T$ generates white noise. The total voltage over the resistor, $v(t)$, satisfies $v(t) = Ri(t) + n(t)$, $\mathbf{E}\,n(t)n(s) = 2k_BTR\,\delta(t-s)$, where $i(t)$ is the current.
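
To get a feeling for the magnitude of this noise, the covariance in Example 2 can be converted into an RMS voltage over a finite measurement bandwidth. The sketch below assumes an ideal brick-wall bandwidth and illustrative component values (1 kΩ, 300 K, 10 kHz); the function name is hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant [J/K]

def johnson_rms_voltage(resistance_ohm, temperature_k, bandwidth_hz):
    """RMS noise voltage of a resistor over a one-sided bandwidth.

    The covariance 2*k_B*T*R*delta(t - s) corresponds to a two-sided spectral
    density 2*k_B*T*R, i.e. 4*k_B*T*R per hertz of one-sided bandwidth.
    """
    return math.sqrt(4.0 * K_B * temperature_k * resistance_ohm * bandwidth_hz)

v_rms = johnson_rms_voltage(1e3, 300.0, 1e4)
print(f"{v_rms:.3e} V")
```

For these values the RMS noise is roughly 0.4 microvolts.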

3.1 Derivation using linear lossless approximations 

Let us first consider systems without memoryless elements. The general solution to the linear lossless system (1) is then

$$y(t) = B^T e^{Jt} x_0 + \int_0^t B^T e^{J(t-s)} B u(s)\,ds, \tag{23}$$

where $x_0$ is the initial state. It is the second term, the convolution, that approximates the dissipative $(Gu)(t)$ in the previous section. In Proposition 1 we showed that the first transient term is not desired in the approximation. Theorems 1 and 2 suggest that we will need a system of extremely high order to approximate a linear dissipative system on a reasonably long time horizon. When dealing with systems of such high dimensions, it is reasonable to assume that the exact initial state $x_0$ is not known, and it can be hard to enforce $x_0 = 0$. Therefore, let us take a statistical approach to study its influence. We have that

$$\mathbf{E}\,y(t) = B^T e^{Jt}\, \mathbf{E}\,x_0 + \int_0^t B^T e^{J(t-s)} B u(s)\,ds,$$

if the input $u(t)$ is deterministic and $\mathbf{E}$ is the expectation operator. The autocovariance function $R_y$ for $y(t)$ is then

$$R_y(t,s) := \mathbf{E}[y(t) - \mathbf{E}y(t)][y(s) - \mathbf{E}y(s)]^T = B^T e^{Jt} X_0 e^{J^T s} B, \tag{24}$$

where $X_0$ is the covariance of the initial state,

$$X_0 := \mathbf{E}\,\Delta x_0 \Delta x_0^T, \tag{25}$$

where $\Delta x_0 := x_0 - \mathbf{E}x_0$ is the stochastic uncertain component of the initial state, which evolves as $\Delta x(t) = e^{Jt}\Delta x_0$. The positive semidefinite matrix $X_0$ can be interpreted as a measure of how well the initial state is known. For a lossless system with total energy $E(x) = \frac{1}{2}x^Tx$ we define the internal energy as

$$U(x) := \frac{1}{2}\Delta x^T \Delta x, \qquad \Delta x := x - \mathbf{E}x. \tag{26}$$

The expected total energy of the system equals $\mathbf{E}E(x) = \frac{1}{2}(\mathbf{E}x)^T\mathbf{E}x + \mathbf{E}U(x)$. Hence the internal energy captures the stochastic part of the total energy, see also [25, 30]. In statistical mechanics, see [SHI], the temperature of a system is defined using the internal energy.

Definition 3 (Temperature). A system with internal energy $U(x)$ [Joule] has temperature $T$ [Kelvin] if, and only if, its state $x$ belongs to Gibbs's distribution with probability density function

$$p(x) = \frac{1}{Z}\exp[-U(x)/k_BT], \tag{27}$$

where $k_B$ is Boltzmann's constant and $Z$ is the normalizing constant called the partition function. A system with temperature is said to be at thermal equilibrium.

When the internal energy function is quadratic and the system is at thermal equilibrium, it is well known that the uncertain energy is equipartitioned between the states, see [H Sec. 4-5].

Proposition 4. Suppose a lossless system with internal energy function $U(x) = \frac{1}{2}\Delta x^T \Delta x$ has temperature $T$ at time $t = 0$. Then the initial state $x_0$ belongs to a Gaussian distribution with covariance matrix $X_0 = k_BT I_n$, and $\mathbf{E}\,U(x_0) = \frac{n}{2}k_BT$.

Hence, the temperature $T$ is proportional to how much uncertain equipartitioned energy there is per degree of freedom in the lossless system. There are many arguments in the physics and information theory literature for adopting the above definition of temperature. For example, Gibbs's distribution maximizes the Shannon continuous entropy (principle of maximum entropy [40, 41]). In this paper, we will simply accept this common definition of temperature, although it is interesting to investigate more general definitions of temperature of dynamical systems.

Remark 5. Note that lossless systems may have a temperature at any time instant, not only at $t = 0$. For instance, a lossless linear system (23) of temperature $T$ at $t = 0$ that is driven by a deterministic input remains at the same temperature and has constant internal energy at all times, since $\Delta x(t)$ is independent of $u(t)$. To change the internal energy using deterministic inputs, nonlinear systems are needed as explained in [23, 24]. For the related issue of entropy for dynamical systems, see [23, 25].

If a lossless linear system (23) has temperature $T$ at $t = 0$ as defined in Definition 3 and Proposition 4, then the autocovariance function (24) takes the form

$$R_y(t,s) = k_BT \cdot B^T e^{J(t-s)} B = k_BT \cdot [B^T e^{J(s-t)} B]^T,$$

since $J^T = -J$. It is seen that linear lossless systems satisfy the fluctuation-dissipation theorem (Proposition 2) if we identify the stochastic transient in (23) with the fluctuation, i.e. $n(t) = B^T e^{Jt} x_0$ (assuming $\mathbf{E}x_0 = 0$), and the impulse response as $g(t) = B^T e^{Jt} B$. In particular, $n(t)$ is a Gaussian process of mean zero because $x_0$ is Gaussian and has mean zero.
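
The identification above can be illustrated with a small Monte Carlo experiment: draw thermal initial states, propagate them through the lossless dynamics, and compare the empirical covariance of the transient with $k_BT\,g(t-s)$. The sketch assumes a randomly generated single-output lossless system of order 6, normalized units $k_BT = 1$, and two arbitrary time instants; the helper `expm_antisymmetric` is hypothetical and relies on $iJ$ being Hermitian.

```python
import numpy as np

def expm_antisymmetric(J, t):
    """exp(J t) for a real antisymmetric J, via the Hermitian matrix 1j*J."""
    w, V = np.linalg.eigh(1j * J)
    return (V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T).real

rng = np.random.default_rng(1)
n, kB_T = 6, 1.0                       # order and k_B*T (normalized, assumed)
A = rng.standard_normal((n, n))
J = A - A.T
B = rng.standard_normal((n, 1))

def g(t):
    """Impulse response g(t) = B^T exp(J t) B of the lossless system."""
    return (B.T @ expm_antisymmetric(J, t) @ B).item()

# Thermal initial states: x0 ~ N(0, k_B T I_n), see Proposition 4.
x0 = rng.normal(0.0, np.sqrt(kB_T), size=(n, 200_000))
t, s = 0.7, 0.2
noise_t = (B.T @ expm_antisymmetric(J, t) @ x0).ravel()   # n(t) = B^T e^{Jt} x0
noise_s = (B.T @ expm_antisymmetric(J, s) @ x0).ravel()
empirical = np.mean(noise_t * noise_s)
predicted = kB_T * g(t - s)
print(f"empirical R_y(t,s) = {empirical:.4f}, k_B T g(t-s) = {predicted:.4f}")
```

The two numbers agree up to sampling error, which is the content of Proposition 2 for this lossless model.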

Theorem 2 showed that dissipative systems with memory can be arbitrarily well approximated by lossless systems. Hence we cannot distinguish between the two using only input-output experiments. One reason for preferring the lossless model is that its transient also explains the thermal noise that is predicted by the fluctuation-dissipation theorem. To explain the fluctuation-dissipation theorem for systems without memory (Proposition 3), one can repeat the above arguments by making a lossless approximation of $k_s$ (see Theorem 1). The antisymmetric part $k_a$ does not need to be approximated but can be included directly in the lossless system by using the antisymmetric direct term $D$ in (12).

Proposition 3 captures the notion of a heat bath, modelling it (as described in Theorem 1) with a lossless system so large that for moderate inputs and within the chosen time horizon, the interaction with its environment is not significantly affected.

That the Langevin equation (21) is a valid state-space model for (18) is shown by a direct calculation. If we assume that (20) is a low-order approximation for a high-order linear lossless system (23), in the sense of Theorem 2, it is enough to require that both systems are at thermal equilibrium with the same temperature $T$ in order to be described by the same stochastic equation (18), at least in the time interval in which the approximation is valid.

3.2 Nonlinear lossless approximations and thermal noise 

Lossless approximations are not unique. We showed in Section 2.4 that low-order nonlinear lossless approximations can be constructed. As seen next, these do not satisfy the fluctuation-dissipation theorem. This is not surprising since they can also model active systems. If they are used to implement linear dissipative systems, the linearized form is not in the form (1). By studying the thermal noise of a system, it could in principle be possible to determine what type of lossless approximation is used.

Consider the nonlinear lossless approximation (14) of $y(t) = ku(t)$, where $k$ is scalar and can be either positive or negative. The approximation only works well when the initial total energy $E_0$ is large. To study the effect of thermal noise, we add a random Gaussian perturbation $\Delta x_0$ to the initial state so that the system has temperature $T$ at $t = 0$ according to Definition 3 and Proposition 4. This gives the system

$$\begin{aligned}
\dot{x}_E(t) &= \frac{k}{\sqrt{2E_0}}\, u(t)^2, \qquad x_E(0) = \sqrt{2E_0} + \Delta x_0, \quad \mathbf{E}\,\Delta x_0 = 0, \\
y_E(t) &= \frac{k}{\sqrt{2E_0}}\, x_E(t) u(t), \qquad \mathbf{E}\,\Delta x_0^2 = k_BT.
\end{aligned} \tag{28}$$

The solution to the lossless approximation (28) is given by

$$y_E(t) = ku(t) + n_s(t) + n_d(t), \tag{29}$$

where

$$n_d(t) = \frac{k^2}{2E_0}\, u(t) \int_0^t u(s)^2\,ds, \qquad n_s(t) = \frac{k\Delta x_0}{\sqrt{2E_0}}\, u(t). \tag{30}$$

We call $n_d(t)$ the deterministic implementation noise and $n_s(t)$ the stochastic thermal noise. The ratio between the deterministic and stochastic noise is

$$\frac{n_d(t)}{n_s(t)} = \frac{k}{\sqrt{2E_0}\,\Delta x_0} \int_0^t u(s)^2\,ds = \frac{ku(0)^2}{\sqrt{2E_0}\,\Delta x_0}\, t + o(t) \to 0,$$

as $t \to 0$, if $u(t)$ is continuous. Hence, for sufficiently small times $t$ and if $\Delta x_0 \ne 0$, the stochastic noise $n_s(t)$ is the dominating noise in the lossless approximation (28). Since $\Delta x_0$ belongs to a Gaussian distribution, there is zero probability that $\Delta x_0 = 0$. Hence, the solution $y_E(t)$ can be written

$$\begin{aligned}
y_E(t) &= ku(t) + n_s(t) + O(t), \\
\mathbf{E}\,n_s(t) &= 0, \qquad \mathbf{E}\,n_s(t)^2 = \frac{k^2k_BT}{2E_0}\, u(t)^2.
\end{aligned} \tag{31}$$

Just as in Proposition 3, the noise variance is proportional to the temperature $T$. Notice, however, that the noise is significantly smaller in (31) than in Proposition 3. There the noise is white and unbounded for each $t$. The expression (31) is further used in Section 4.
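
The decomposition (29)-(30) and the variance in (31) can be reproduced with a short ensemble simulation. The values of $k$, $E_0$, the input $u(t) = \cos(3t)$ and the normalization $k_BT = 1$ in the sketch below are illustrative assumptions; the integral of $u^2$ is evaluated by trapezoidal quadrature.

```python
import numpy as np

rng = np.random.default_rng(2)
k, E0, kB_T = -1.0, 50.0, 1.0          # illustrative values (assumptions)
t = np.linspace(0.0, 1.0, 201)
u = np.cos(3.0 * t)                    # illustrative continuous input

# Cumulative trapezoidal integral of u(s)^2 over [0, t].
u2 = u**2
int_u2 = np.concatenate(([0.0], np.cumsum(0.5 * (u2[1:] + u2[:-1]) * np.diff(t))))

# Ensemble of thermal perturbations of the initial state, one per row.
dx0 = rng.normal(0.0, np.sqrt(kB_T), size=(5000, 1))
r = np.sqrt(2.0 * E0)
x_E = r + dx0 + k * int_u2 / r         # solution of the state equation in (28)
y_E = k / r * x_E * u                  # output equation in (28)

n_d = k**2 / (2.0 * E0) * u * int_u2   # deterministic implementation noise (30)
n_s = k * dx0 / r * u                  # stochastic thermal noise (30)
decomposition_gap = np.max(np.abs(y_E - (k * u + n_s + n_d)))

var_empirical = n_s.var(axis=0)
var_predicted = k**2 * kB_T / (2.0 * E0) * u**2   # variance in (31)
print(decomposition_gap, var_empirical[0], var_predicted[0])
```

The empirical variance follows $u(t)^2$ and scales with $k_BT/E_0$, so charging the device with more energy reduces the thermal noise in this realization.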



3.3 Summary 

In Section 3 we have seen that uncertainty in the initial state of a linear lossless approximation gives a simple explanation for the fluctuation-dissipation theorem. We have also seen that uncertainty in the initial state of a nonlinear lossless approximation gives rise to noise which does not satisfy the fluctuation-dissipation theorem. In all cases, the variance of the noise is proportional to the temperature of the system. Only when the initial state is perfectly known, that is when the system has temperature zero, can perfect approximation using lossless systems be achieved.



4 Limits on Measurements and Back Action 

In this section, we study measurement strategies and devices using the developed theory. In quantum mechanics, the problem of measurements and their interpretation has been much studied and debated. Also in classical physics there have been studies on limits on measurement accuracy. Two examples are [42, 43], where thermal noise in measurement devices is analyzed and bounds on possible measurement accuracy derived. Nevertheless, the effect of the measurement device on the measured system, the "back action", is usually neglected in classical physics. That such effects exist also in classical physics is well known, however, and is called the "observer effect". Also in control engineering these effects are usually neglected: the sensor is normally modeled to interact with the controlled plant only through the feedback controller.

Using the theory developed in this paper, we will quantify and give limits on observer effects in a fairly general setting. These limitations should be of practical importance for control systems on the small physical scale, such as for MEMS and in systems biology.

4.1 Measurement problem formulation 

Assume that the problem is to estimate the scalar potential $y(t_m)$ (an output) of a linear dissipative dynamical system $\mathcal{S}$ at some time $t_m > 0$. Furthermore, assume that the conjugate variable of $y$ is $u$ (the "flow" variable). Then the product $y(t)u(t)$ is a work rate. As has been shown in Section 2.3, all single-input-single-output linear dissipative systems can be arbitrarily well approximated by a dynamical system in the form

$$\mathcal{S}: \begin{cases} \dot{x}(t) = Jx(t) + Bu(t), & x(0) = x_0, \\ y(t) = B^T x(t), & y(0) = y_0 = B^T x_0, \end{cases} \tag{32}$$

for a fixed initial state $x_0$. Note that this system evolves deterministically since $x_0$ is fixed. Let us also define the parameter $C$ by $B^TB =: 1/C$. Then $1/C$ is the first Markov parameter of the transfer function of $\mathcal{S}$. If $\mathcal{S}$ is an electrical capacitor and the measured quantity a voltage, $C$ coincides with the capacitance.

To estimate the potential $y(t_m)$, an idealized measurement device called $\mathcal{M}$ is connected to $\mathcal{S}$ in the time interval $[0, t_m]$; see Fig. 2. The validity of Kirchhoff's laws is assumed in the interconnection. That is, the flow out of $\mathcal{S}$ goes into $\mathcal{M}$, and the potential difference $y(t)$ over the devices is the same (a lossless interconnection). The device $\mathcal{M}$ has an ideal flow meter that gives the scalar value $u_m(t) = -u(t)$. Therefore the problem is to estimate the potential of $\mathcal{S}$ given knowledge of the flow $u(t)$. For this problem, two related effects are studied next, the back action $b(t_m)$, and the estimation error $e(t_m)$. By back action we mean how the interconnection with $\mathcal{M}$ affects the state of $\mathcal{S}$. It quantifies how much the state of $\mathcal{S}$ deviates from its natural trajectory after the measurement. Estimation error is the difference between the actual potential and the estimated potential. Next we consider two measurement strategies and their lossless approximations in order to study the impact of physical implementation.

Figure 2: Circuit diagram of an idealized measurement device $\mathcal{M}$ and the measured system $\mathcal{S}$. The measurement is performed in the time interval $[0, t_m]$. The problem is to estimate the potential $y(t_m)$ as well as possible, given the flow measurement $u_m = -u$.

Figure 3: Circuit diagrams of the memoryless dissipative measurement device $\mathcal{M}_1$ (left) and the memoryless active measurement device $\mathcal{M}_2$ (right).

Remark 6. The reason the initial state $x_0$ in $\mathcal{S}$ is fixed is that we want to compare how different measurement strategies succeed when used on exactly the same system. We also assume that $y_0 = B^Tx_0$ is completely unknown to the measurement device before the measurement starts.

4.2 Memoryless dissipative measurement device

Consider the measurement device $\mathcal{M}_1$ to the left in Fig. 3. This measurement device connected to $\mathcal{S}$ is modeled by a memoryless system with (a known) admittance $k_m > 0$,

$$\mathcal{M}_1: \begin{cases} u_m(t) = -u(t) = k_m y(t), \\ y_m(t) = \dfrac{u_m(t)}{k_m} = y(t). \end{cases}$$

The signal $y_m(t)$ is the measurement signal produced by $\mathcal{M}_1$. The dynamics of the interconnected measured system becomes

$$\mathcal{S}\mathcal{M}_1: \begin{cases} \dot{x}_1(t) = (J - k_mBB^T)x_1(t), & x_1(0) = x_0, \\ y_1(t) = B^Tx_1(t), \end{cases}$$

where $x_1(t)$ is the state of $\mathcal{S}$ when it is interconnected to $\mathcal{M}_1$. If the measurement circuit is closed in the time interval $[0, t_m]$, then the state of the system $\mathcal{S}$ gets perturbed from its natural trajectory by a quantity

$$b(t_m) := x_1(t_m) - x(t_m) = -k_my_0Bt_m + O(t_m^2),$$

where $x(t)$ satisfies (32) with $u(t) = 0$, and $b(t_m)$ is the back action. By making the measurement time $t_m$ small, the back action can be made arbitrarily small.

In this situation, a good estimation policy for the potential $y_1(t_m)$ is to choose $\hat{y}(t_m) = y_m(t_m)$, since the estimation error $e(t_m)$ is identically zero in this case,

$$e(t_m) := \hat{y}(t_m) - y_1(t_m) = 0.$$

The signal $\hat{y}(t_m)$ should here, and in the following, be interpreted as the best possible estimate of the potential of $\mathcal{S}$ for someone who has access to the measurement signal $y_m(t)$, $0 \le t \le t_m$. Note that the estimation error $e$ is defined with respect to the perturbed system $\mathcal{S}\mathcal{M}_1$. Given that we already have defined back action, it is easy to give a relation to the unperturbed system $\mathcal{S}$ by

$$y(t_m) = \hat{y}(t_m) - e(t_m) - B^Tb(t_m), \tag{33}$$

which is valid for non-zero estimation errors also.

Remark 7. Whether one is interested in the perturbed potential $y_1(t_m)$ or the unperturbed potential $y(t_m)$ of $\mathcal{S}$ depends on the reason for the measurement. For a control engineer who wants to act on the measured system, $y_1(t_m)$ is likely to be of most interest. A physicist, on the other hand, who is curious about the uncontrolled system may be more interested in $y(t_m)$. Either way, knowing the back action $b$, one can always get $y(t_m)$ from $y_1(t_m)$ using (33).



4.2.1 Lossless realization $\hat{\mathcal{M}}_1$

Next we make a linear lossless realization of the admittance $k_m > 0$ in $\mathcal{M}_1$, using Proposition 3, so that it satisfies the fluctuation-dissipation theorem. Linear physical implementations of $\mathcal{M}_1$ inevitably exhibit this type of Johnson-Nyquist noise. We obtain

$$\hat{\mathcal{M}}_1: \begin{cases} u_m(t) = -u(t) = k_my(t) + \sqrt{2k_mk_BT_m}\,n(t), \\ y_m(t) = \dfrac{u_m(t)}{k_m} = y(t) + \sqrt{\dfrac{2k_BT_m}{k_m}}\,n(t), \end{cases}$$

where $T_m$ is the temperature of the measurement device, and $n(t)$ is unit-intensity white noise. As shown before, the noise can be interpreted as due to our ignorance of the exact initial state of the measurement device. The interconnected measured system $\mathcal{S}\hat{\mathcal{M}}_1$ satisfies a Langevin-type equation,

$$\mathcal{S}\hat{\mathcal{M}}_1: \begin{cases} \dot{x}_1(t) = (J - k_mBB^T)x_1(t) - \sqrt{2k_mk_BT_m}\,Bn(t), & x_1(0) = x_0, \\ y_1(t) = B^Tx_1(t). \end{cases}$$

The solution for $\mathcal{S}\hat{\mathcal{M}}_1$ is

$$x_1(t) = e^{(J-k_mBB^T)t}x_0 - \int_0^t e^{(J-k_mBB^T)(t-s)}B\sqrt{2k_mk_BT_m}\,n(s)\,ds.$$

The back action can be calculated as

$$\begin{aligned}
b(t_m) &= x_1(t_m) - x(t_m) = b_d(t_m) + b_s(t_m), \\
b_d(t_m) &:= \mathbf{E}\,x_1(t_m) - x(t_m) = e^{(J-k_mBB^T)t_m}x_0 - e^{Jt_m}x_0, \\
b_s(t_m) &:= x_1(t_m) - \mathbf{E}\,x_1(t_m) = -\int_0^{t_m} e^{(J-k_mBB^T)(t_m-s)}B\sqrt{2k_mk_BT_m}\,n(s)\,ds,
\end{aligned}$$

where we have split the back action into deterministic and stochastic parts. The deterministic back action coincides with the back action for $\mathcal{M}_1$. The stochastic back action comes from the uncertainty in the lossless realization of the measurement device. The measurement device $\hat{\mathcal{M}}_1$ injects a stochastic perturbation into the measured system $\mathcal{S}$.

The covariance $P$ of the back action $b$ at time $t_m$ is

$$\begin{aligned}
P(t_m) &:= \mathbf{E}[b(t_m) - \mathbf{E}b(t_m)][b(t_m) - \mathbf{E}b(t_m)]^T \\
&= 2k_mk_BT_m\int_0^{t_m} e^{(J-k_mBB^T)(t_m-s)}BB^T\left(e^{(J-k_mBB^T)(t_m-s)}\right)^T ds = 2BB^Tk_mk_BT_mt_m + O(t_m^2).
\end{aligned} \tag{34}$$

It holds that $P(t_m) \to k_BT_mI_n$ and $\mathbf{E}\,x_1(t_m) \to 0$ as $t_m \to \infty$, see [30, Propositions 1 and 2], and the measured system attains temperature $T_m$ after an infinitely long measurement. It is therefore reasonable to keep $t_m$ small if one wants to have a small back action.

Next we analyze and bound the estimation error. The measurement equation is given by

$$y_m(t) = \frac{u_m(t)}{k_m} = y_1(t) + \sqrt{\frac{2k_BT_m}{k_m}}\,n(t).$$

Note that $\hat{y}(t_m) = y_m(t_m)$ is now a poor estimator of $y_1(t_m)$, since the variance of the estimation error $e(t) = \hat{y}(t) - y_1(t)$ is infinite due to the white noise $n(t)$. Using filtering theory, we can construct an optimal estimator that achieves a fundamental lower bound on the possible accuracy (minimum variance) given $y_m(t)$ in the interval $0 \le t \le t_m$. The solution is the Kalman filter,

$$\begin{aligned}
\dot{\hat{x}}_1(t) &= (J - k_mBB^T)\hat{x}_1(t) + K(t)[y_m(t) - B^T\hat{x}_1(t)], \\
\hat{y}(t) &= B^T\hat{x}_1(t),
\end{aligned} \tag{35}$$

where $K(t)$ is the Kalman gain (e.g. [Sj). The minimum possible variance of the estimation error, $M^*(t_m) = \min \mathbf{E}(\hat{y}(t_m) - y_1(t_m))^2$ ($*$ denotes optimal), can be computed from the differential Riccati equation

$$\begin{aligned}
\dot{X}(t) &= J_{k_m}X(t) + X(t)J_{k_m}^T + 2k_mk_BT_mBB^T \\
&\quad - (X(t) - 2k_BT_mI_n)B\,\frac{k_m}{2k_BT_m}\,B^T(X(t) - 2k_BT_mI_n), \\
M^*(t_m) &= B^TX(t_m)B, \qquad J_{k_m} := J - k_mBB^T.
\end{aligned} \tag{36}$$

A series expansion $X(t) = \frac{1}{t}X_{-1} + X_0 + tX_1 + \ldots$ of the solution to (36) yields that the coefficient $X_{-1}$ should satisfy $X_{-1} = \frac{k_m}{2k_BT_m}X_{-1}BB^TX_{-1}$. Note that $X_{-1}$ is independent of $J_{k_m}$. Multiplying the $X_{-1}$ equation by $B^T$ from the left and $B$ from the right gives $B^TX_{-1}B = 2k_BT_m/k_m$, so that

$$M^*(t_m) = \frac{2k_BT_m}{k_mt_m} + O(1),$$

since $M^*(t) = \frac{1}{t}B^TX_{-1}B + B^TX_0B + tB^TX_1B + \ldots$ Here the boundary condition $M^*(0) = +\infty$ has been used, since it is assumed that $y_0$ is completely unknown, see Remark 6. It is easy to verify that $M^*(t_m) \to 0$ as $t_m \to \infty$, and given an infinitely long measurement a perfect estimate is obtained. This comes at the expense of a large back action.

To implement the Kalman filter (35) requires a complete model $(J, B, k_m, T_m)$, which is not always reasonable to assume. Nevertheless, the Kalman filter is optimal and the variance of the estimation error, $M(t) := \mathbf{E}\,e(t)^2$, of any other estimator, in particular those that do not require complete model knowledge, must satisfy

$$M(t_m) \ge M^*(t_m) = \frac{2k_BT_m}{k_mt_m} + O(1). \tag{37}$$
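
For a first-order system the Riccati equation (36) can be checked by hand or numerically. In the scalar case $J = 0$ (the only antisymmetric scalar), and one can verify that $X(t) = 2k_BT_m/(k_mB^2t)$ solves (36) exactly for this special case. The sketch below performs that check; the scalar `b` stands in for $B$, and the parameter values and normalized units $k_BT_m = 1$ are assumptions.

```python
import numpy as np

k_m, b, kB_Tm = 0.5, 2.0, 1.0     # admittance, scalar B, k_B*T_m (assumed values)

def riccati_rhs(X):
    """Right-hand side of (36) for n = 1, where J = 0 and J_km = -k_m b^2."""
    J_km = -k_m * b**2
    return (2.0 * J_km * X + 2.0 * k_m * kB_Tm * b**2
            - (X - 2.0 * kB_Tm) ** 2 * b**2 * k_m / (2.0 * kB_Tm))

t = np.linspace(0.1, 5.0, 50)
X = 2.0 * kB_Tm / (k_m * b**2 * t)         # candidate solution with X(0) = +inf
X_dot = -2.0 * kB_Tm / (k_m * b**2 * t**2) # its time derivative
M_star = b**2 * X                          # M*(t) = B^T X(t) B
print(np.max(np.abs(riccati_rhs(X) - X_dot)))
```

In this special case the lower bound reduces to $M^*(t) = 2k_BT_m/(k_mt)$ with no additional $O(1)$ term, which also shows directly that $M^*(t_m) \to 0$ as $t_m \to \infty$.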

4.2.2 Back action and estimation error trade-off

Define the root mean square back action and the root mean square estimation error of the potential $y$ by

$$|\Delta y(t_m)| := \sqrt{B^TP(t_m)B}, \qquad |\Delta\hat{y}(t_m)| := \sqrt{M(t_m)}.$$

This is the typical magnitude of the change of the potential $y$ and the estimation error after a measurement. Using (34) and (37), the appealing relation

$$|\Delta y(t_m)|\,|\Delta\hat{y}(t_m)| \ge 2k_BT_m/C + O(t_m), \tag{38}$$

where $1/C = B^TB$, is obtained. Hence, there is a direct trade-off between the accuracy of estimation and the perturbation in the potential, independently of (small) $t_m$ and admittance $k_m$. It is seen that the more "capacitance" ($C$) $\mathcal{S}$ has, the less important the trade-off is. One can interpret $C$ as a measure of the physical size or inertia of the system. The trade-off is more important for "small" systems in "hot" environments. Using an optimal filter, the trade-off is satisfied with equality.

Table 1: Summary of back action and estimation error after a measurement in the time interval $[0, t_m]$. $b_d(t_m)$: deterministic back action, $P(t_m)$: covariance of back action, $\lvert\Delta y\rvert^2$: variance of potential, and $M^*(t_m)$: lower bound on estimation error.

| Device | $b_d(t_m)$ | $P(t_m) = \mathbf{E}\,b_s(t_m)b_s(t_m)^T$ | $\lvert\Delta y(t_m)\rvert^2 = B^TP(t_m)B$ | $M^*(t_m) = \min \lvert\Delta\hat{y}(t_m)\rvert^2$ |
|---|---|---|---|---|
| $\mathcal{M}_1$ | $-k_my_0Bt_m + O(t_m^2)$ | $0$ | $0$ | $0$ |
| $\hat{\mathcal{M}}_1$ | $-k_my_0Bt_m + O(t_m^2)$ | $2k_mk_BT_mBB^Tt_m + O(t_m^2)$ | $\frac{2k_mk_BT_m}{C^2}t_m + O(t_m^2)$ | $\frac{2k_BT_m}{k_m}t_m^{-1} + O(1)$ |
| $\mathcal{M}_2$ | $0$ | $0$ | $0$ | $0$ |
| $\hat{\mathcal{M}}_2$ | $O(t_m^2)$ | $2k_mk_BT_mBB^Tt_m + O(t_m^2)$ | $\frac{2k_mk_BT_m}{C^2}t_m + O(t_m^2)$ | $\frac{2k_BT_m}{k_m}t_m^{-1} + O(1)$ |
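
The leading-order terms behind the trade-off (38) are simple enough to evaluate directly. The sketch below multiplies the leading terms of (34) and (37) for several admittances and measurement times and compares the result with $2k_BT_m/C$. The capacitance of 1 pF, the temperature of 300 K, and the function names are illustrative assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant [J/K]

def leading_order_tradeoff(k_m, t_m, T_m, C):
    """Leading-order |Delta y| * |Delta y_hat| when the optimal filter is used."""
    back_action_var = 2.0 * k_m * K_B * T_m * t_m / C**2   # B^T P B from (34)
    estimation_var = 2.0 * K_B * T_m / (k_m * t_m)         # M* from (37)
    return math.sqrt(back_action_var * estimation_var)

def tradeoff_bound(T_m, C):
    """Right-hand side of (38) without the O(t_m) term."""
    return 2.0 * K_B * T_m / C

C, T_m = 1e-12, 300.0     # 1 pF at room temperature (illustrative values)
products = [leading_order_tradeoff(k_m, t_m, T_m, C)
            for k_m in (1e-6, 1e-3) for t_m in (1e-9, 1e-6)]
print(products, tradeoff_bound(T_m, C))
```

All four products coincide with the bound, illustrating that, to leading order, neither the admittance nor the measurement time can be tuned to escape the trade-off.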



4.3 Memoryless active measurement device

A problem with the device $\mathcal{M}_1$ is that it causes back action $b$ even in the most ideal situation. If active elements are allowed in the measurement device, this perturbation can apparently be easily eliminated, but of course with the inherent costs of an active device. Consider the measurement device $\mathcal{M}_2$ to the right in Fig. 3. It is modeled by

$$\mathcal{M}_2: \begin{cases} u_m(t) = k_my(t), \\ u(t) = u_m(t) - k_my(t) = 0, \\ y_m(t) = \dfrac{u_m(t)}{k_m} = y(t), \end{cases}$$

where an active element $-k_m$ exactly compensates for the back action in $\mathcal{M}_1$. It is clear that there is no back action and no estimation error using this device,

$$b(t_m) = 0, \qquad e(t_m) = 0,$$

for all $t_m$. Next, a lossless approximation of $\mathcal{M}_2$ is performed.
4.3.1 Lossless realization $\hat{\mathcal{M}}_2$

Let the dissipative element $k_m$ in $\mathcal{M}_2$ be implemented with a linear lossless system, see Proposition 3, and the active element $-k_m$ be implemented using the nonlinear lossless system in (28). This approximation of $\mathcal{M}_2$ captures the reasonable assumption that the measurement device must be charged with energy to behave like an active device, and that its linear dissipative element satisfies the fluctuation-dissipation theorem.

Assume that the temperature of the measurement device $\hat{\mathcal{M}}_2$ is $T_m$ and the deterministic part of the total energy of the active element is $E_m$. Then the interconnected system becomes

$$\mathcal{S}\hat{\mathcal{M}}_2: \begin{cases}
\dot{x}_2(t) = (J - k_mBB^T)x_2(t) + \dfrac{k_m}{\sqrt{2E_m}}\,x_r(t)BB^Tx_2(t) - B\sqrt{2k_mk_BT_m}\,n(t), & x_2(0) = x_0, \\
\dot{x}_r(t) = -\dfrac{k_m}{\sqrt{2E_m}}\,x_2(t)^TBB^Tx_2(t), & x_r(0) = \sqrt{2E_m} + \Delta x_{r0}, \\
\mathbf{E}\,\Delta x_{r0} = 0, \qquad \mathbf{E}\,\Delta x_{r0}^2 = k_BT_m, \\
y_m(t) = \dfrac{u_m(t)}{k_m} = B^Tx_2(t) + \sqrt{\dfrac{2k_BT_m}{k_m}}\,n(t),
\end{cases}$$

where $x_2$ is the state of $\mathcal{S}$, and $x_r$ is the state of the active element. Using the closed-form solution (29)-(30) to eliminate $x_r$, we can also write the equations as

$$\mathcal{S}\hat{\mathcal{M}}_2: \begin{cases}
\dot{x}_2(t) = \left(J + \dfrac{k_m\Delta x_{r0}}{\sqrt{2E_m}}BB^T\right)x_2(t) + Bw_d(t) - B\sqrt{2k_mk_BT_m}\,n(t), & x_2(0) = x_0, \\
y_m(t) = B^Tx_2(t) + \sqrt{\dfrac{2k_BT_m}{k_m}}\,n(t),
\end{cases} \tag{39}$$

with the deterministic perturbation $w_d(t) = -\frac{k_m^2y_0^3}{2E_m}t + O(t^2)$. The solution to (39) can be expanded as



X2{t) =xo- ^j2k^kBT„,BN{t) 



xB f N{s)ds + 5^1^*2 ^ ^(^2)^ ^4Q) 



where $N(t) = \int_0^t n(s)\,ds = O(\sqrt{t})$ is integrated white noise (a Brownian motion). It can be seen that the white noise disturbance $n$ is much more important than the deterministic disturbance $w_d$. The back action becomes

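$$b(t_m) = x_2(t_m) - x(t_m) = b_d(t_m) + b_s(t_m),$$

$$b_d(t_m) := \mathbf{E}\,x_2(t_m) - x(t_m) = \dots,$$

$$b_s(t_m) := x_2(t_m) - \mathbf{E}\,x_2(t_m) = -\sqrt{2 k_m k_B T_m}\, B N(t_m) + \dots,$$

where we used that the covariance between $\Delta x_{r0}$ and the white noise
$n$ is zero. The covariance of the back action becomes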

$$P(t_m) := \mathbf{E}\, b_s(t_m) b_s(t_m)^T = 2 k_m k_B T_m B B^T t_m + O(t_m^2). \tag{41}$$

It is seen that the dominant term in the stochastic back action is the same
as for $M_1$, but the deterministic back action $b_d$ is much smaller.
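
As a rough numerical illustration of (41), the following Python sketch compares the leading-order covariance $2 k_m k_B T_m B B^T t_m$ with a Monte Carlo estimate. It assumes that only the dominant term $b_s(t_m) \approx -\sqrt{2 k_m k_B T_m}\, B N(t_m)$ is kept, with $N(t_m)$ Gaussian of variance $t_m$. The function names, the vector $B$, and all parameter values are illustrative choices and do not come from the paper.

```python
import math
import random

K_B = 1.380649e-23  # Boltzmann constant in J/K


def leading_order_cov(k_m, temp_m, b, t_m):
    """Leading term 2*k_m*k_B*T_m*B*B^T*t_m of (41), as a nested list."""
    scale = 2.0 * k_m * K_B * temp_m * t_m
    return [[scale * bi * bj for bj in b] for bi in b]


def sample_cov(k_m, temp_m, b, t_m, n_samples=20000, seed=0):
    """Monte Carlo covariance of b_s ~ -sqrt(2 k_m k_B T_m) * B * N(t_m).

    Assumption: N(t_m) is a zero-mean Gaussian with variance t_m
    (a Brownian motion sampled at time t_m); higher-order terms are dropped.
    """
    rng = random.Random(seed)
    amp = math.sqrt(2.0 * k_m * K_B * temp_m)
    n = len(b)
    acc = [[0.0] * n for _ in range(n)]
    for _ in range(n_samples):
        big_n = rng.gauss(0.0, math.sqrt(t_m))
        bs = [-amp * bi * big_n for bi in b]
        for i in range(n):
            for j in range(n):
                acc[i][j] += bs[i] * bs[j]
    return [[v / n_samples for v in row] for row in acc]
```

With enough samples the two matrices should agree to within the Monte Carlo error, which shrinks like the inverse square root of the number of samples.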

Remark 8. Using a nonlinear lossless approximation of $-k_m$ of order larger
than one, we can make the deterministic back action smaller for fixed $E_m$,
at the expense of model complexity.

The measurement noise in $SM_2$ is the same as in $SM_1$, and we can
essentially repeat the argument from Section [4.2.1]. The difference between
$SM_2$ and $SM_1$ lies in the dynamics. In $SM_2$, the system matrix is
$J + \dots$ and there is a deterministic perturbation $w_d(t)$. To make an
estimate of $y(t_m)$, knowledge of $y_m(t)$ in the interval $[0, t_m]$ is
assumed. If we assume that the model $(J, B, k_m, T_m)$ is known plus that
the observer
somehow knows $w_d(t)$ and $\Delta x_{r0}$, then the optimal estimate again
has the error covariance $M^*(t_m) = \frac{2 k_B T_m}{k_m t_m} + O(1)$. Any
other estimator that has less information available must be worse, so that

$$M(t_m) \ge M^*(t_m) = \frac{2 k_B T_m}{k_m t_m} + O(1).$$

Again, we have the trade-off (38)

$$|\Delta y(t_m)|\,|\Delta\hat{y}(t_m)| \ge \frac{2 k_B T_m}{C} + O(t_m),$$

which holds even though we have inserted an active element in the device.
The only effect of the active element is to eliminate the deterministic back
action.
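
The cancellation behind this trade-off can be checked with a small Python sketch. It is an illustration under stated assumptions, not the paper's computation: the measured system is taken to be a single capacitor with $B = 1/\sqrt{C}$ (so $B^T B = 1/C$), the output perturbation variance is the leading term $B^T P(t_m) B$ from (41), and the estimation error variance is taken as $2 k_B T_m/(k_m t_m)$, i.e. white measurement noise of intensity $2 k_B T_m / k_m$ averaged over $[0, t_m]$. All function names and numbers are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def back_action_std(k_m, temp_m, cap, t_m):
    """Leading-order std of the output perturbation for a capacitor.

    Assumption: B = 1/sqrt(C), so Var = B^T P B = 2 k_m k_B T_m t_m / C^2.
    """
    return math.sqrt(2.0 * k_m * K_B * temp_m * t_m) / cap


def estimation_std(k_m, temp_m, t_m):
    """Leading-order std of the best estimate, sqrt(2 k_B T_m / (k_m t_m))."""
    return math.sqrt(2.0 * K_B * temp_m / (k_m * t_m))


def tradeoff_product(k_m, temp_m, cap, t_m):
    """Product of the two standard deviations; k_m and t_m cancel."""
    return back_action_std(k_m, temp_m, cap, t_m) * estimation_std(k_m, temp_m, t_m)
```

Changing the admittance $k_m$ or the measurement time $t_m$ moves error between the two factors, but under these assumptions the product stays at $2 k_B T_m / C$.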

4.4 Summary and Discussion 

The back action and estimation error of the measurement devices are
summarized in Table [1]. For the ideal devices $M_1$ and $M_2$ no real
trade-offs exist. However, if we realize them with lossless elements, very
reasonable trade-offs appear. It is only in the limit of infinite available
energy and zero temperature that the trade-offs disappear. The deterministic
back action can be made small with large $E_m$, charging the measurement
device with much energy. However, the effect of stochastic back action is
inescapable for both $M_1$ and $M_2$, and the trade-off

$$|\Delta y|\,|\Delta\hat{y}| \ge 2 k_B T_m / C \quad \text{for small } t_m, \tag{42}$$

holds in both cases. The reason for having short measurements is to minimize
the effect of the back action. The lower bound on the estimation error
$M^*(t_m)$ tends to zero for large $t_m$, but at the same time the measured
system $S$ tends to a thermodynamic equilibrium with the measurement device.

It is possible to increase the estimation accuracy by making the admittance
$k_m$ of the measurement device large, but only at the expense of making a
large stochastic perturbation of the measured system. Hence, we have
quantified a limit for the observer effect discussed in the introduction of
this section. We conjecture that inequalities like (42) hold for very general
measurement devices as soon as the dissipative elements satisfy the
fluctuation-dissipation theorem. Note, for example, that if a lossless
transmission cable of admittance $k_m$ and of temperature $T_m$ is used to
interconnect the system $S$ to an arbitrary measurement device $M$, then the
trade-off (42) holds. The deterministic back action, on the other hand, can
be made smaller by using more elaborate nonlinear lossless implementations.

5 Conclusions



In this paper, we constructed lossless approximations of both dissipative 
and active systems. We obtained an if-and-only-if characterization of linear 
dissipative systems (linear lossless systems are dense in the linear dissipative 
systems) and gave explicit approximation error bounds that depend on the 
time horizon, the order, and the available energy of the approximations. We 
showed that the fluctuation-dissipation theorem, that quantifies macroscopic 
thermal noise, can be explained by uncertainty in the initial state of a linear 
lossless approximation of very high order. We also saw that using these 
techniques, it was relatively easy to quantify limitations on the back action 
of measurement devices. This gave rise to a trade-off between process and 
measurement noise. 

6 Appendices 

6.1 Proof of Theorem [2] 

We first show the 'only if' direction. Assume the opposite: There is a
lossless approximation $G_N$ that satisfies (12) for arbitrarily small
$\epsilon > 0$ even though $G$ is not dissipative. From Proposition [1] it is
seen that we can without loss of generality assume $G_N$ has a minimal
realization and $x_0 = 0$. If $G$ is not dissipative, we can find an input
$u(t)$ over the interval $[0, \tau]$ such that
$\int_0^\tau y(t)^T u(t)\,dt = -K_1 < 0$, i.e., we extract energy from $G$
even though its initial state is zero. Call $\|u\|^2_{L_2[0,\tau]} = K_2$. We
have $\int_0^\tau (y_N(t) - y(t))^T u(t)\,dt \le \epsilon K_2$, by the
assumption that a lossless approximation $G_N$ exists and using the
Cauchy-Schwarz inequality. But the lossless approximation satisfies
$\int_0^\tau y_N(t)^T u(t)\,dt = \frac{1}{2} x_N(\tau)^T x_N(\tau)$, since
$x_0 = 0$. Hence,
$-\int_0^\tau y(t)^T u(t)\,dt = K_1 \le \epsilon K_2 - \frac{1}{2} x_N(\tau)^T x_N(\tau) \le \epsilon K_2$.
But since $\epsilon$ can be made arbitrarily small, this leads to a
contradiction.

To prove the 'if' direction we explicitly construct a $G_N$ that satisfies
(12), when $G$ is dissipative. It turns out that we can fix the model
parameters $D = 0$ in $G_N$. Furthermore, we must choose $x_0 = 0$ since
otherwise the zero trajectory $y = 0$ cannot be tracked (see above). We thus
need to construct a lossless system with impulse response $g_N(t)$ such that
$\|g - g_N\|_{L_2[0,\tau_0]} \le \epsilon$, where we have denoted the time
interval given in the theorem statement by $[0, \tau_0]$. Note that we can
increase this time interval without loss of generality, since if we prove
$\|g - g_N\|_{L_2[0,\tau]} \le \epsilon$ then
$\|g - g_N\|_{L_2[0,\tau_0]} \le \epsilon$, if $\tau \ge \tau_0$.
Let us define the constants

$$C_1 \ge \|g(t)\|_2,\ t \ge 0; \qquad C_2 := \int_0^\infty \|\dot{g}(t)\|_2\,dt; \qquad C_3 := \int_0^\infty \|g(t)\|_2\,dt; \qquad C := 4C_1 + 2C_2 + \frac{4C_3}{\tau_0},$$

which are all finite by the assumptions of the theorem. It will become clear
later why the constants are defined this way.

Next let us fix the approximation time interval $[0, \tau]$ such that

$$\delta(\tau) := \int_\tau^\infty \|g(t)\|_2\,dt \le \frac{\epsilon^2}{2C\sqrt{p}}, \tag{43}$$

where $\tau \ge \tau_0$. Such a $\tau$ always exists since $\delta(t)$ is a
continuously decreasing function that converges to zero. The lossless
approximation is achieved by truncating a Fourier series keeping $N$ terms.
Let us choose the integer $N$ such that

$$N \le \frac{\tau C^2}{\epsilon^2} < N + 1, \tag{44}$$

where $\tau$ is fixed in (43). We proceed by constructing an appropriate
Fourier series.

6.1.1 Fourier expansion 

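The extended function $\tilde{g}(t) \in L_2(-\infty, \infty)$ of $g(t)$ is
given by

$$\tilde{g}(t) = \begin{cases} g(t), & t \ge 0, \\ g(-t)^T, & t < 0. \end{cases}$$

Let us make a Fourier expansion of $\tilde{g}(t)$ on the interval
$[-\tau, \tau]$,

$$g_\tau(t) := \frac{1}{2}A_0 + \sum_{k=1}^{\infty} A_k \cos\frac{k\pi t}{\tau} + B_k \sin\frac{k\pi t}{\tau},$$

with convergence in $L_2[-\tau, \tau]$. For the restriction to $[0, \tau]$ it
holds that $\|g - g_\tau\|_{L_2[0,\tau]} = 0$. The expressions for the
(matrix) Fourier coefficients are

$$A_k = \frac{1}{\tau}\int_0^\tau \left(g(t) + g(t)^T\right)\cos\frac{k\pi t}{\tau}\,dt, \qquad B_k = \frac{1}{\tau}\int_0^\tau \left(g(t) - g(t)^T\right)\sin\frac{k\pi t}{\tau}\,dt. \tag{45}$$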
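Note that $A_k, B_k \in \mathbb{R}^{p \times p}$, and that the $A_k$ are
symmetric ($A_k = A_k^T$) and the $B_k$ are anti-symmetric
($B_k = -B_k^T$). Parseval's formula becomes

$$\|g_\tau\|^2_{L_2[0,\tau]} = \mathrm{Tr}\int_0^\tau g(t)g(t)^T\,dt = \frac{\tau}{4}\,\mathrm{Tr}\,A_0^T A_0 + \frac{\tau}{2}\sum_{k=1}^{\infty}\left(\mathrm{Tr}\,A_k^T A_k + \mathrm{Tr}\,B_k^T B_k\right). \tag{46}$$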
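
To make the coefficient formulas (45) concrete, here is a minimal Python sketch that approximates $A_k$ and $B_k$ with a midpoint rule. The impulse response `example_g` is a hypothetical $2 \times 2$ example chosen only to exercise the formulas, and the function names and the quadrature scheme are assumptions of this sketch rather than anything prescribed by the paper.

```python
import math


def fourier_coeffs(g, p, tau, k, steps=2000):
    """Midpoint-rule approximation of the p x p matrices A_k and B_k in (45).

    g(t) must return a p x p nested list; tau is the half-length of the
    expansion interval [-tau, tau].
    """
    h = tau / steps
    a = [[0.0] * p for _ in range(p)]
    b = [[0.0] * p for _ in range(p)]
    for s in range(steps):
        t = (s + 0.5) * h
        gt = g(t)
        c = math.cos(k * math.pi * t / tau) * h / tau
        d = math.sin(k * math.pi * t / tau) * h / tau
        for i in range(p):
            for j in range(p):
                a[i][j] += (gt[i][j] + gt[j][i]) * c  # g + g^T, symmetric part
                b[i][j] += (gt[i][j] - gt[j][i]) * d  # g - g^T, anti-symmetric part
    return a, b


def example_g(t):
    """Hypothetical 2 x 2 impulse response, used only for illustration."""
    return [[math.exp(-t), t * math.exp(-t)], [0.0, math.exp(-2.0 * t)]]
```

Calling `fourier_coeffs(example_g, 2, 5.0, 3)` returns a symmetric `a` and an anti-symmetric `b`, mirroring the structure of $A_k$ and $B_k$ noted above.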



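We also need to bound
$\|A_k - jB_k\|_2^2 = \mathrm{Tr}\,A_k^T A_k + \mathrm{Tr}\,B_k^T B_k$. It
holds

$$A_k - jB_k = \frac{1}{\tau}\int_{-\tau}^{\tau}\tilde{g}(t)e^{-jk\pi t/\tau}\,dt = \frac{(-1)^k}{jk\pi}\left(g(\tau)^T - g(\tau)\right) + \frac{1}{jk\pi}\left(g(0) - g(0)^T\right) + \frac{1}{jk\pi}\int_{-\tau}^{\tau}\dot{\tilde{g}}(t)e^{-jk\pi t/\tau}\,dt,$$

using integration by parts. Then

$$\|A_k - jB_k\|_2 \le \frac{4C_1}{k\pi} + \frac{2}{k\pi}\int_0^\tau\|\dot{g}(t)\|_2\,dt \le \frac{4C_1 + 2C_2}{k\pi}.$$

Furthermore,

$$\|A_k - jB_k\|_2 = \left\|\frac{1}{\tau}\int_{-\tau}^{\tau}\tilde{g}(t)e^{-jk\pi t/\tau}\,dt\right\|_2 \le \frac{2C_3}{\tau} \le \frac{2C_3}{\tau_0},$$

since $\tau \ge \tau_0$. If the former bound is multiplied by $k$ and the
latter is multiplied by two and they are added together, we obtain

$$\|A_k - jB_k\|_2 \le \frac{C}{2 + k}, \qquad k \ge 0, \tag{47}$$

where $C$ was defined above.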
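
The decay bound (47) can be sanity-checked on a scalar example. The sketch below assumes $g(t) = e^{-t}$, for which $B_k = 0$ and $A_k$ has the closed form $\frac{2}{\tau}\,\frac{1 - (-1)^k e^{-\tau}}{1 + (k\pi/\tau)^2}$, and it assumes the constant is combined as $C = 4C_1 + 2C_2 + 4C_3/\tau_0$ with $C_1 = C_2 = C_3 = 1$ for this $g$. The function names are hypothetical.

```python
import math


def coeff_norm(k, tau):
    """|A_k - j B_k| for the scalar example g(t) = exp(-t); here B_k = 0."""
    omega = k * math.pi / tau
    return (2.0 / tau) * (1.0 - (-1) ** k * math.exp(-tau)) / (1.0 + omega ** 2)


def decay_constant(c1, c2, c3, tau0):
    """C = 4*C1 + 2*C2 + 4*C3/tau0, the combination assumed in this sketch."""
    return 4.0 * c1 + 2.0 * c2 + 4.0 * c3 / tau0


def bound_holds(tau, tau0, k_max):
    """Check |A_k - j B_k| <= C / (2 + k) for k = 0, ..., k_max."""
    # For g(t) = exp(-t): sup |g| = 1, int |g'| dt = 1, int |g| dt = 1.
    c = decay_constant(1.0, 1.0, 1.0, tau0)
    return all(coeff_norm(k, tau) <= c / (2 + k) for k in range(k_max + 1))
```

For this example the coefficients decay like $1/k^2$, so the $1/(2+k)$ bound is far from tight. It is the summability of its square that matters in the truncation argument.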



6.1.2 Lossless approximation G_N

Let us now truncate the series $g_\tau(t)$ and keep the terms with Fourier
coefficients $A_0, \dots, A_{N-1}$ and $B_1, \dots, B_{N-1}$. The truncated
impulse response can be realized exactly by a finite-dimensional lossless
system iff $A_0 \ge 0$ and $A_k - jB_k \ge 0$, $k = 1, \dots, N-1$, see
[..., Theorem 5]. But these inequalities are not necessarily true. We will
thus perturb the coefficients to ensure the system becomes lossless and yet
ensure that the $L_2$-approximation error is less than $\epsilon$.

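We quantify a number $\xi > 0$ that ensures that
$A_k - jB_k + \xi I_p \ge 0$ for all $k$. Note that by the assumption of $G$
being dissipative, it holds that

$$\hat{g}(j\omega) + \hat{g}(-j\omega)^T = \int_{-\infty}^{\infty}\tilde{g}(t)e^{-j\omega t}\,dt \ge 0.$$

Remember that
$\int_{-\tau}^{\tau}\tilde{g}(t)e^{-jk\pi t/\tau}\,dt = \tau A_k - j\tau B_k$,
and therefore

$$A_k - jB_k + \Delta_k \ge 0,$$

where
$\Delta_k := \frac{1}{\tau}\int_\tau^\infty g(t)e^{-jk\pi t/\tau} + g(t)^T e^{jk\pi t/\tau}\,dt$.
The size of $\Delta_k$ can be bounded and we have

$$\|\Delta_k\|_2 = \sqrt{\mathrm{Tr}\,\Delta_k^*\Delta_k} \le \frac{2}{\tau}\int_\tau^\infty\|g(t)\|_2\,dt \le \frac{\epsilon^2}{\tau C\sqrt{p}},$$

using (43). Thus we can choose

$$\xi := \frac{\epsilon^2}{\tau C\sqrt{p}},$$

and $A_k - jB_k + \xi I_p \ge 0$ for all $k$, since
$\rho(\Delta_k) \le \|\Delta_k\|_2$.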
Next we verify that a system with impulse response

$$g_N(t) := \frac{1}{2}(A_0 + \xi I_p) + \sum_{k=1}^{N-1}(A_k + \xi I_p)\cos\frac{k\pi t}{\tau} + B_k\sin\frac{k\pi t}{\tau}, \tag{48}$$

where $\tau$, $N$, $\xi$ are fixed above, satisfies the statement of the
theorem. By the construction of $\xi$, $G_N$ is lossless. It remains to show
that the approximation error $\|g - g_N\|_{L_2[0,\tau]}$ is less than
$\epsilon$. Using Parseval's formula (46), it holds



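$$\begin{aligned}
\|g - g_N\|^2_{L_2[0,\tau]} &= \|g_\tau - g_N\|^2_{L_2[0,\tau]} \\
&= \left\|\frac{1}{2}\xi I_p + \sum_{k=1}^{N-1}\xi I_p\cos\frac{k\pi t}{\tau} - \sum_{k=N}^{\infty}\left(A_k\cos\frac{k\pi t}{\tau} + B_k\sin\frac{k\pi t}{\tau}\right)\right\|^2_{L_2[0,\tau]} \\
&\le \frac{\tau}{2}\,N p\,\xi^2 + \frac{\tau}{2}\sum_{k=N}^{\infty}\frac{C^2}{(2+k)^2} \\
&\le \frac{\tau}{2}\cdot\frac{\tau C^2}{\epsilon^2}\cdot\frac{\epsilon^4 p}{\tau^2 C^2 p} + \frac{\tau}{2}\cdot\frac{C^2}{N+1} \le \frac{\epsilon^2}{2} + \frac{\tau}{2}\cdot\frac{C^2\epsilon^2}{\tau C^2} = \epsilon^2,
\end{aligned}$$

where the bounds (44) and (47) are used. The result has been proved.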
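
The construction can be tried out numerically on a scalar example. The Python sketch below assumes $g(t) = e^{-t}$ (a dissipative first-order system, with $B_k = 0$ and closed-form $A_k$), builds the truncated and shifted cosine series of the form (48), and evaluates the $L_2[0,\tau]$ error with a midpoint rule. The function names, the quadrature, and the parameter values are illustrative assumptions.

```python
import math


def cosine_coeff(k, tau):
    """Closed-form A_k for the scalar example g(t) = exp(-t) on [0, tau]."""
    omega = k * math.pi / tau
    return (2.0 / tau) * (1.0 - (-1) ** k * math.exp(-tau)) / (1.0 + omega ** 2)


def g_n(t, tau, n_terms, xi):
    """Scalar version of (48): coefficients A_0..A_{N-1}, each shifted by xi."""
    total = 0.5 * (cosine_coeff(0, tau) + xi)
    for k in range(1, n_terms):
        total += (cosine_coeff(k, tau) + xi) * math.cos(k * math.pi * t / tau)
    return total


def l2_error(tau, n_terms, xi, steps=4000):
    """Midpoint-rule estimate of the L2[0, tau] norm of g - g_N."""
    h = tau / steps
    acc = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        diff = math.exp(-t) - g_n(t, tau, n_terms, xi)
        acc += diff * diff * h
    return math.sqrt(acc)
```

In this scalar case every $A_k$ is already positive, so the shift $\xi$ is not needed for losslessness. It is kept only to show its cost, which by Parseval's formula adds $\frac{\tau}{2}(N - \frac{1}{2})\xi^2$ to the squared error.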

6.2 Proof of Theorem d 

We first show the 'if' direction. Then there exists a lossless and
time-reversible (with respect to $\Sigma_e$, see Definition [2])
approximation $G_N$ of $G$. Theorem [2] shows that $G$ is dissipative.
Theorem 8 in [28] shows that $G_N$ necessarily is reciprocal with respect to
$\Sigma_e$. Since $G_N$ is an arbitrarily good approximation it follows that
$G$ also is reciprocal, which concludes the 'if' direction of the proof.

Next we show the 'only if' direction. Then $G$ is dissipative and reciprocal
with respect to $\Sigma_e$. Theorem [2] shows that there exists an
arbitrarily good lossless approximation $G_N$, and we will use the
approximation (48). That $G$ is reciprocal with respect to $\Sigma_e$ means
that $\Sigma_e g(t) = g(t)^T \Sigma_e$, see Definition [1]. Using this and the
definition of $A_k$ and $B_k$ in (45), it is seen that



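$$\Sigma_e A_k = A_k^T \Sigma_e, \qquad \Sigma_e B_k = B_k^T \Sigma_e.$$

Thus the chosen $G_N$ is also reciprocal,
$\Sigma_e g_N(t) = g_N(t)^T \Sigma_e$, and Theorem 8 in [28] shows $G_N$ is
time reversible with respect to $\Sigma_e$. This concludes the proof.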